Elon Musk’s Gifts to Web Scrapers (Guest Blog Post)

By Kieran McCarthy

Elon Musk may have done more to open the Internet to web scraping than any person or public interest advocacy group.

Not that he meant to do this, mind you. He was trying to do the opposite. But by providing a foil in litigation against both the Center for Countering Digital Hate (“CCDH”) and Bright Data (the world’s largest seller of scraped data), he’s given judges in the most important district court in the country for tech legal issues, the Northern District of California, plenty of motivation to rule against him. As a result, judges have provided two landmark opinions in the last 45 days in favor of web scrapers. This creates powerful new precedent that will make it easier for web scrapers to prevail in litigation and will make it much harder for websites to prevent scraping.

Bright Data has long sold the data of all the major social media companies. In November 2023, X Corp. sued Bright Data for trespass to chattels, breach of contract, tortious interference with a contract, violation of California Business and Professions Code Section 17200, and misappropriation.

Recently, Bright Data had its motion to dismiss granted against X Corp. on all counts. X Corp. v. Bright Data Ltd., 3:23-cv-03698 (N.D. Cal. May 9, 2024).

The court held:

Our court of appeals has held that giving social media companies “free rein to decide, on any basis, who can collect and use data — data that the companies do not own, that they otherwise make publicly available to viewers, and that the companies themselves collect and use — risks the possible creation of information monopolies that would disserve the public interest.” hiQ Labs, Inc. v. LinkedIn Corp., 31 F.4th 1180, 1202 (9th Cir. 2022). With that in mind, this district court carefully considered each of the claims asserted. It now concludes that none of the claims passes muster.

X Corp. v. Bright Data at 2.

If you’re a Fortune 500 company looking to stop scraping, and the court drops this paragraph into the opinion, you are in for a bad day.

The court bifurcated its analysis into two parts: Twitter’s claims related to allegedly improper access to its systems and its claims regarding allegedly improper selling of scraped data.

For the improper access claims, the court dismissed the trespass to chattels, tortious interference with a contract, and breach of contract claims on the basis that Twitter failed to allege “resulting damages” associated with Bright Data’s access.

With respect to the tortious interference and breach of contract claims, the court found that the complaint only contained allegations of damages related to the selling of scraped data. According to the court, there were no specific allegations that the access of its systems had caused resulting damages to Twitter. And while the court would find other reasons to dismiss the claims related to selling of scraped data, the claims for improper access were dismissed because there were no alleged damages associated with them.

For the trespass to chattels claim, the court quoted Hamidi in saying that “[s]hort of dispossession, personal injury, or physical damages (not present here), intermeddling is actionable only if the chattel is impaired as to its condition, quality, value, or. . . the possessor is deprived of the use of the chattel for a substantial time.” Hamidi at 306. Twitter’s complaint contained no allegations of such impairment or deprivation, according to the court. I agree with the court on this, but it seems inconsistent with some other recent opinions that found that a trespass to chattels claim had been properly pleaded, despite threadbare allegations.

The Section 17200 claim was dismissed because the court did not find any fraudulent conduct that caused harm to Twitter associated with Bright Data’s access to its systems.

For the claims related to illegal selling of scraped data, the court dismissed those because they were preempted by copyright.

I did a deep dive on this topic in December, but the general gist of it is that copyright law preempts state law claims if the state-law claims come within the general scope of copyright.

Here, the court took a slightly different approach to this problem compared to what I have seen in the past.

As recognized by our court of appeals, however, “claims are not preempted if they fall outside the scope of [Section] 301(a)’s express preemption and are not otherwise in conflict with the Act.” Ryan v. Editions Ltd. W., Inc., 786 F.3d 754, 760 (9th Cir. 2015) (emphasis added). Although conflict preemption has played second fiddle to express preemption in the caselaw as of late, it is the more appropriate consideration when the question presented is not whether rights created by state law are equivalent to rights created by federal copyright law but whether enforcement of state law undermines federal copyright law. As legal commentators have observed for decades, the question is often teed up where, as here, state-law claims draw upon a standard form contract. That rights created by contract law are not “equivalent” to rights created by copyright law does not mean that copyright law will never come into conflict with broad-based contractual terms. (citations omitted)

X Corp. v. Bright Data at 21.

As I wrote back in December, scraping claims are almost always about unwanted copying and distribution of data. That’s what copyright law is for. And while any competent attorney can draw up language in a ToS to skirt any literal equivalency standard, allowing them to avoid copyright preemption based on a one-sided ToS absolutely undermines copyright law. And that, in turn, further restricts what should be in the public domain and gives companies power to create property rights where none are otherwise granted in the law.

Here, the court agreed, and dismissed Twitter’s breach-of-contract claims on that basis.

Lastly, the court dismissed a breach-of-contract claim based on an amendment that Twitter made to its terms of service after it had initiated litigation, because, well, that doesn’t seem like a reasonable thing for a plaintiff to be allowed to do.

If the Northern District of California’s recent rulings with Bright Data were a bit of a surprise, its recent decision in the X Corp. v. CCDH case was not.

Sometimes it is unclear what is driving a litigation, and only by reading between the lines of a complaint can one attempt to surmise a plaintiff’s true purpose. Other times, a complaint is so unabashedly and vociferously about one thing that there can be no mistaking that purpose. This case represents the latter circumstance. This case is about punishing the Defendants for their speech.

X Corp. v. Center for Countering Digital Hate, 2024 WL 1246318, at *1 (N.D. Cal. March 25, 2024).

And so began Judge Charles Breyer’s scathing 52-page opinion in the recent X Corp. v. Center for Countering Digital Hate, Inc. et al. case.

When it first was published, Eric did an excellent job summarizing the opinion in this post. But I wanted to take a slightly different angle in my analysis of the case.

My primary takeaway from this case is that it is another absolute gift to web scrapers. It paved the way for yesterday’s opinion in X Corp. v. Bright Data. And it will pave the way for many more similar opinions in the future.

Most people think that the hiQ Labs opinion was the most favorable legal decision for web scrapers, even though the final disposition of that case went exactly the opposite way. The Meta v. Bright Data case from earlier this year was also a monumental win for web scrapers. But I think that this opinion is even more significant. Whereas hiQ Labs and Meta v. Bright Data were qualified wins on unique fact patterns with uncertain ramifications for other cases, Breyer’s opinion in this case makes clear pronouncements that can be applied to other web-scraping conflicts.

Historically, the two hardest claims for web scrapers to overcome in litigation were breach of contract and the Computer Fraud and Abuse Act (“CFAA”). In the X Corp. v. CCDH case, Musk brought both claims. Breyer dispatched both. And the reasoning for why these claims were dismissed should apply to plenty of other litigants.

First, the facts:

As a user of the X platform, CCDH necessarily agreed to X Corp.’s Terms of Service (“ToS”) when it created a new account in 2019. The ToS provided that “‘scraping the Services without the prior consent of Twitter is expressly prohibited.’” The ToS did not define “scraping.” However, scraping generally means “extracting data from a website and copying it into a structured format, allowing for data manipulation or analysis.” See hiQ Labs, Inc. v. LinkedIn Corp., 31 F.4th 1180, 1186 n.4 (9th Cir. 2022) (hereinafter “hiQ 2022 Circuit opinion”); see also hiQ Labs, Inc. v. LinkedIn Corp., 639 F. Supp. 3d 944, 954 (N.D. Cal. 2022) (hereinafter “hiQ 2022 district opinion”) (defining “scraping” as “a process of extracting information from a website using automated means”).

X Corp. v. CCDH at 2.

This fact pattern has often been curtains for web scrapers. Few scraping defendants would have dared to litigate these facts.

But given the opening lines of the opinion quoted above, it’s not surprising that Breyer was motivated to find for the defendant. And it’s not surprising that advocacy groups like the ACLU and the EFF were eager to speak out in defense of CCDH.

But whatever the policy inclinations of judges and advocacy groups, the reality of the case law makes it difficult to distinguish between pro-social groups like CCDH, run-of-the-mill commercial entities, and even gray area or dark web-type scrapers. Pruneyard-based First Amendment and public square-type arguments have failed in prior cases. When the party seeking to enforce a terms of use violation is a private litigant, First Amendment arguments are usually dead ends.

The challenge with scraping litigation in the context of free speech is how to weed out the pro-social scrapers from the not-so-pro-social scrapers.

I think Breyer did a great job threading the needle.

His solution was brilliant in its simplicity. For both the breach of contract and CFAA claims, he focused on whether the conduct that gave rise to the claims created cognizable damages and/or harm. In both instances, he concluded they did not.

For the breach of contract claim, he wrote:

The breach that X Corp. alleges here is CCDH’s scraping of the X platform. X Corp. does not allege any damages stemming directly from CCDH’s scraping of the X platform. X Corp. seeks only damages based on the reactions of advertisers (third parties) to CCDH’s speech in the Toxic Twitter report, which CCDH created after the scraping. See FAC ¶¶ 70, 78; see also ACLU Br. at 12 (“The damages X Corp. seeks . . . are tied to reputational harm only, with no basis in any direct physical, operational or other harm that CCDH U.S.’s alleged scraping activities inflicted on X Corp.”). That is just what the Fourth Circuit disallowed in Food Lion, 194 F.3d at 522. The speech was not the breach, as it was in Cohen.

The alleged “breach” here was scraping. The alleged “harm” here was a report that painted Twitter in an unflattering light and caused Twitter to purportedly lose advertising revenue. But while the harm might have been a downstream consequence of the breach, it did not stem directly from it.

For the CFAA claim, the rationale for dismissing X Corp.’s claim was nearly identical. Under the CFAA, to state a claim, one must allege a “loss” or damages of at least $5,000. Traditionally, this has been an easy threshold to meet.

Breyer focused on a lesser-known passage from the 2021 Van Buren Supreme Court opinion to dismiss Twitter’s CFAA claim.

CCDH argues that X Corp.’s loss allegations are inadequate because they are not losses from technical harms. That is indeed a requirement under the CFAA, as the Supreme Court held in Van Buren v. United States, 141 S. Ct. 1648, 1659–60 (2021). The Court explained that “[t]he term ‘loss’ . . . relates to costs caused by harm to computer data, programs, systems, or information services.” Despite X Corp.’s suggestion at the motion hearing that only damages must relate to “damage to the system,” the Court in Van Buren stated that both damages and loss under the CFAA “focus on technological harms—such as the corruption of files—of the type unauthorized users cause to computer systems and data,” Van Buren, 141 S. Ct. at 1660. It observed that “[l]imiting ‘damage’ and ‘loss’ in this way makes sense in a scheme ‘aimed at preventing the typical consequences of hacking.’” Id. (quoting Royal Truck & Trailer Sales & Serv., Inc. v. Kraft, 974 F.3d 756, 760 (6th Cir. 2020))….

X Corp.’s losses in connection with “attempting to conduct internal investigations in efforts to ascertain the nature and scope of CCDH’s unauthorized access to the data,” are not technological in nature. The data that CCDH accessed does not belong to X Corp., see Kaplan Decl. Ex. A at 13 (providing that users own their content and grant X Corp. “a worldwide, non-exclusive, royalty-free license”), and there is no allegation that it was corrupted, changed, or deleted. Moreover, the servers that CCDH accessed are not even X Corp.’s servers.

This seems simple and straightforward, and I agree with its reasoning. But it may also be a departure from prior precedent.

Notably, when viewed through the lens of this opinion and rationale, how would Power Ventures be decided if it were litigated today? Facebook v. Power Ventures involved a social media aggregator’s consensual use of its users’ Facebook passwords to access their Facebook accounts. The service that Power Ventures sold was a platform to manage multiple social media accounts together. Facebook objected and sent a cease-and-desist letter. According to the Ninth Circuit panel, after receiving a cease-and-desist letter, Power Ventures could not access Facebook accounts with “authorization” under the CFAA and was thus violating the CFAA, even though it still had authorization from Facebook’s users to access their accounts with shared credentials.

I don’t think Power Ventures caused any technological harm to Facebook’s computers or user data. I don’t think there were allegations in the case that user data was “corrupted, changed, or deleted.” It was accessed according to the stated preferences of users, but against Facebook’s wishes. That gave rise to a CFAA claim in 2016, and I can’t square that opinion with this one. Either the Power Ventures opinion is no longer good law or there is a way to reconcile the two opinions that is not obvious to me.

I think Power Ventures was wrongly decided, and so I think Breyer’s rationale is correct here. The CFAA isn’t for creating walled gardens; it’s to prevent harmful conduct caused by hacking. If the conduct isn’t a legally cognizable harm in the first place, it shouldn’t give rise to a CFAA claim.

There are many forms of scraping and unwanted access that are harmful, and those should still give rise to CFAA and breach of contract claims. But in instances like this, where we have a protective website that’s suing because a wealthy man’s feelings got hurt when the truth was told in a public forum, that should not be actionable under the law.