Tuesday, February 22, 2022

Copyright Office Technical Measures Consultations

Unfortunately, the other sessions, including Register Perlmutter’s opening remarks, overlapped with my teaching.

11:00 AM – 12:30 PM, Session 2

Meta, Rob Arcamona: Meta takes © and creativity very seriously and has developed content protection tools beyond its legal obligations, hand in hand w/content owners; voluntary cooperation is successful.

Microsoft, Lauren Chamblee: Historically engaged w/rightsholders on © concerns. Four key points: (1) A one-size-fits-all approach is burdensome; stakeholders differ in their level of control over content, and different platforms attract different content—search engine and cloud computing policies differ. (2) Certain measures are harder to implement in the cloud space. Customers in sensitive industries like healthcare, finance, and banking use encryption; cloud providers can’t implement certain measures that might work elsewhere. (3) Implementation can present resource challenges for smaller platforms/platforms w/few infringement problems. Forcing them to reallocate resources not justified by the infringement on their platforms is counterproductive and can lead to overblocking. (4) Incentivize innovative solutions addressing specific concerns on specific platforms rather than adopting generic solutions that don’t work in context.

GitHub, Rose Coogan: More than 73 million creators—students, developers, startups, NGOs, governments—use GitHub to collaborate on open source, a large part of what distinguishes the developer community from other communities. Not traditional © content. GitHub has unique expertise on code, too; code is different from photos, music, and videos, and some of the most valuable code on GitHub is licensed openly. Remediation, not removal, is often the goal—changing the code rather than removing it often resolves the problem, e.g., addressing a violation of an open source license by adding attribution etc. under the terms of the relevant license. Taking down code is significant; the consequences of removal are often disproportionate to the interests of the © owner and can break many programs that depend on the availability of that code. Analyzing infringement claims for code is rarely simple. Tech measures at scale will rarely if ever be an acceptable proxy for analyzing infringement claims.

CTIA, Kathryn Dall’Asta: Wireless communication industry. Flexible, not one size fits all.

Wikimedia Foundation, Jacob Rogers: DMCA compliance lead for 7 years. Nonprofit hosts several projects including Wikipedia, with 100s of 1000s of users contributing content around the world. Content is designed to be freely licensed or public domain; we occasionally use fair use images where no free image is available, such as when a famous work has been destroyed, with careful justifications/a written record when we make those exceptions. Limitations of tech measures: our users have their own processes for reviewing © works and their own tools for flagging items for potential review, leading to a very small number of DMCA notices. When we do get them, esp. from automated tools, they are very low quality: less than 1 in 5 notices are valid for our projects. Invalid notices tend to discourage/demoralize our volunteers, so we review very carefully and reject when invalid. Encourage CO to think carefully about smaller nonprofit and educational uses.

Organization for Transformative Works, Rebecca Tushnet: The Organization for Transformative Works (“OTW”) is a nonprofit established to protect and defend fans and fanworks from commercial exploitation and legal challenge. The OTW’s nonprofit, volunteer-operated website hosting transformative noncommercial works, the Archive of Our Own, was launched in late 2009. It has over 4.2 million registered users, hosts over 8.5 million unique works, and receives approximately 440 million page views per week, on a budget of slightly over half a million dollars per year. [I made a NYT comparison but other sources disagree with the numbers I found—so it’s probably most fair to say that we’re in roughly the same class as the NYT for visits.]

Our story explains why it is a mirage to imagine standard technical measures adequate to distinguish infringing from non-infringing materials on the Archive of Our Own and similar websites. Like most large websites that are not in the news every day, we receive very few copyright claims despite our size. I have been working with the OTW since its inception, and we have never received more than a handful of copyright claims per year. Our modal DMCA notice is from a person who deleted their account without removing their works and now wants those works removed, even though they initially posted them to the site. We do receive copyright claims based on titles or trademarks, which are invalid, and while we occasionally receive a valid claim not based on an initially authorized posting, our records indicate that we have never received a second copyright notice based on rights in the same complaining work.

Given all this, the idea of using technical measures to prescreen content is foolish and counterproductive. It would impose substantial costs we don’t have the resources to bear, and it would have prevented literally zero of the claims we have received over 13 years in existence, sacrificing 8.5 million new creations for no purpose. We are unusually successful for a nonprofit website, but we are not unusual in the nonexistent benefit of technological filtering or in our inability to afford such features—the cost-benefit analysis for websites like ours is just completely different from that of charismatic megasites like Facebook.

In addition, our constituents routinely report that they have trouble making transformative fair uses elsewhere, on sites like YouTube, notwithstanding that they are fair uses, which filtering software cannot detect. We are, for example, deeply concerned about the identification of Scribd’s BookID as a potential model, when Scribd itself has acknowledged its significant limitations, including its routine misidentification of public domain materials and of quotes that get caught in its filters. The same problems arose for Audible Magic, especially with the pandemic shift to live-streamed performances, when classical concerts were wrongly flagged as copies of existing recordings, and for Content ID, showing that misidentification is a pervasive problem without a technical fix.

RIAA, Victoria Sheckler: Internet is amazing but is also a vehicle for distributing pirated content, unfairly competing w/licensed content. Appreciate voluntary technical measures developed to ID sound recordings and determine what to do w/them.

Cloudflare, Alissa Starzak: Tech measures used by platforms do not work for infrastructure providers and could have profound impacts on privacy and security. Cloudflare takes voluntary tech measures that would be completely inapplicable to others—rightsholders object that we protect sites from attack even though we don’t host; we can route the complaints to the host. Natural instinct is that one measure should be equally applicable to all sorts of providers, but that mistakes the diversity in the system.

Black Music Action Coalition, Willie “Prophet” Stiggers: Founded to combat racism in the music industry. Appreciate CO efforts; looking to protect Black artists.

Pex, Megumi Yukie: Provides technical measures for ID’ing, protecting and licensing © works through Attribution Engine.

Q: how do current tech measures affect users, rightsholders, providers?

Sheckler: the existing measures help radically in ID’ing content, facilitating licensing/use, and protecting against/stopping unauthorized use. TDEX is a multistakeholder body creating better licensing of music. Automated permission technologies include third-party services like Audible Magic and Pex and proprietary systems like Content ID. None are perfect, but they dramatically ease friction.

Chamblee: some are applicable to platforms that host UGC containing a lot of rightsholder content. Harder on, e.g., an enterprise platform where providers don’t have access to the content hosted on their site; they can pass on complaints to the customer w/o having access to the content. Tech measures that are useful in one place aren’t always useful in other contexts.

Sheckler: not a one-size-fits-all policy. What works on a search engine won’t necessarily work on a UGC platform. But that doesn’t end the inquiry. We do have certain APIs w/large search engines to assist them in identifying infringement.

Starzak: there are lots of kinds of measures; the CO notice was focused on large platforms’ uses. But tech measures that allow rightsholders to submit claims are part of the ecosystem too. We have a variety of different players. The idea of standardization under legal process is very different from a voluntary mechanism. Once a gov’t process starts, it starts to look like a legal regime and not a voluntary regime. Goal is to think about different voluntary measures, not one size fits all.

Rob Arcamona: interested in continuing voluntary cooperation. Infringement is against our TOS, and developing new systems is in our users’ interests, since they are often creators; we want to create a system that works for them. Not only are some services different from social media; social media platforms may also differ greatly from one another. [Respect for getting FB’s antitrust message out in this space!] The types of speech and the types of infringement that occur on one may not occur on another. Narrowly tailored solutions are the most effective.

Rogers: we have a similar thing at Wikimedia. One of the tools most requested by volunteers is a better plagiarism-detection tool. We want to ensure our projects are freely available for use; the importance of variation is critical. When we see the CO mentioning tools that rely on scanning works before upload, that raises lots of concerns; it preempts the open discussion and user processes used on Wikipedia and could be very disruptive.

Sheckler: think about implementation characteristics. Automated content recognition: how good is the tech at ID’ing the work at issue? That depends on the type of work. Second, once you recognize that work, what are you going to do about it? That’s tailoring for specific platforms and users. Wants to distinguish b/t the underlying tech and the business rules for implementing it on any given platform.

Yukie: Pex sees benefits to all sides in using our tech measures. © holders can register assets, ID content, and be fairly compensated; rights holders of all sizes can benefit, including individual creators who might not have the same level of access/opportunity to monetize. Smaller and up-and-coming OSPs get a cost-effective solution that allows them to easily license and provide authorized content; they might not otherwise have a level playing field in seeking licenses directly from rights owners. One of our clients is an up-and-coming app that allows users to upload clips of themselves dancing to popular music; friendly competition. To succeed, it needs to allow DanceFight users to use the songs. Pex allows DanceFight to afford this in the absence of large resources/name recognition. Users don’t have to worry about takedowns and can also register their © works in our asset registry.

Stiggers: Which artists’ organizations are at the table in these discussions? We haven’t been at the table for the most part.

Arcamona: We regularly meet with large studios and individual creators and organizations that represent individual creators including those on the panel today [does he mean RIAA?]. Development has to be deliberate.

Dall’Asta: Narrow tailoring b/c there are differences in implementation and what’s appropriate.

RT: Sorry to be a party pooper, but “one size doesn’t fit all” is not a full description of the tradeoffs. In other fora, content owners complain viciously about Content ID missing stuff; I’m sure there is underblocking, and there is also a lot of overblocking, especially of fair use and public domain material (Fox runs a NASA clip and then NASA content gets blocked). So the tradeoff is not just whether you are big enough to justify using these measures; these measures also have serious costs to creators.

Coogan: One size fits all doesn’t work. We’ve found that automated notices are very rarely valid—that has to be taken into consideration.

Q: what features of new technical measures would be important to success?

Rogers: Note there are less scrupulous organizations that use content-ID tools to harass people making appropriate lawful uses. Don’t just think about the technical tools; think about what bad actors will do with the technical information they’re getting. An ID tool should avoid ID’ing people who would be in the crosshairs in the first place.

Arcamona: Have to develop with an eye towards misuse, whether intentional or unintentional (like Fox forgetting to exclude its NASA clip). Has to be deliberate.

Sheckler: there are differences b/t Wikipedia and fandom sites and sites that trade on music. We often see completely bogus counternotices—the focus on abuse has to be on both sides of the house. Tech measures can still move the ball forward.

RT: Designing for abuse is a great idea and not something that 512 and 1201 did. Consider, for example, privacy concerns with notices and counternotices. I’m sure there are abusive counternotices, but what I’ve seen involves Google, and Google does have side deals that allow rightsholders to override counternotices. We should have people who don’t focus on © but do focus on abuse talk to us about system design. One thing that would help: a clear statement from the CO that screening DMCA notices for quality does not disentitle a platform from the safe harbor w/r/t other instances of putative infringement.

Starzak: Laboratory of experiments is useful on the abuse and innovation sides.

Yukie: misuse and fair use: Pex has tried to address these w/processes. Fingerprinting and matching itself should be available to everyone, and all rightsholders should be able to register and license content. But to properly implement that, there need to be anti-misuse provisions. So we try to prevent rightsholders from claiming content they don’t own, and we have downstream dispute mechanisms.

RT: Germany’s proposed implementation of the new © rules, w/quantitative thresholds for filtering and mandatory ADR—nobody is talking about it, but some of the companies here might have to implement it. One Q: are they going to do it worldwide? That could obviate some parts of this proceeding and make other questions salient.

Chamblee: Europe does exclude enterprise service platforms and online source code sharing platforms, so this proceeding is broader.

Q: what role do you see tech measures playing in the future and how will that play out in costs/benefits for those involved in their use?

Starzak: what you’ve heard from everyone is that there are already a lot of tech measures in play. They will continue to be developed. The Q, even on the European side, is what that looks like and what effect it has in the long term. That depends on implementation!

RT: Jennifer Urban et al.’s empirical work on DMCA Plus and DMCA Classic shows that business forces do drive innovation over and above, and perhaps regardless of, the legal framework. As Europe’s struggles with implementing its changes continue, it may be that it is very hard for law to affect what the tech does/can do. This is a space to monitor, but perhaps law doesn’t have all that much to say.

Sheckler: costs will go down b/c that’s what time does to the cost of tech. [That’s not really true of the costs of overblocking—it’s a limited definition of costs.]

Arcamona: Effectiveness is the key question, and that comes from customization and voluntary cooperation among industry members. Systems for incentivizing creativity and promoting it on our platform: we have an independent artist program allowing licensing on our platform, and we cover the costs for that—collaboration can protect © and expression.

Rogers: Not just what tech evolves, but what evolves to be effective, is very important. Using algorithms to help humans do better work is likely to be more productive in the long run than automated tools, which both over- and under-identify.

Coogan: we see a lot of third-party submitters with automated systems that search the platform for keywords like a company name, w/o considering the context of use. W/o humans able to evaluate the results, we see a lot of invalid takedowns, which goes to effectiveness. Existing tools are based on music or video and do not take into account things like code. [Preach!]

Chamblee: On certain platforms, fair use might be a particular concern—measures have to be tailored to the platform.

RT: Echoing Rose Coogan’s statements on the last question: existing tools are optimized for video and music, a super important point that hasn’t gotten much airtime, which is why we are so concerned about touting Scribd as a model. Compare Amazon’s experience: they told the CO at the 512 roundtables that over half of the takedowns they get for their Kindle Direct self-publishing arm—also text—are invalid.
