Web scraping doesn’t violate anti-hacking law, appeals court rules

[Image: LinkedIn CEO Jeff Weiner.]

Scraping a public website without the approval of the website’s owner isn’t a violation of the Computer Fraud and Abuse Act, an appeals court ruled on Monday. The ruling comes in a legal battle that pits Microsoft-owned LinkedIn against a small data-analytics company called hiQ Labs.

HiQ scrapes data from the public profiles of LinkedIn users, then uses the data to help companies better understand their own workforces. After tolerating hiQ’s scraping activities for several years, LinkedIn sent the company a cease-and-desist letter in 2017 demanding that hiQ stop harvesting data from LinkedIn profiles. Among other things, LinkedIn argued that hiQ was violating the Computer Fraud and Abuse Act, America’s main anti-hacking law.

This posed an existential threat to hiQ because the LinkedIn website is hiQ’s main source of data about clients’ employees. So hiQ sued LinkedIn, seeking not only a declaration that its scraping activities were not hacking but also an order banning LinkedIn from interfering.

A trial court sided with hiQ in 2017. On Monday, the 9th Circuit Court of Appeals agreed with the lower court, holding that the Computer Fraud and Abuse Act simply doesn’t apply to information that’s available to the general public.

“The CFAA was enacted to prevent intentional intrusion onto someone else’s computer—specifically computer hacking,” a three-judge panel wrote. The court notes that members of Congress debating the law repeatedly drew analogies to physical crimes like breaking and entering. In the 9th Circuit’s view, this implies that the CFAA only applies to information or computer systems that were private to start with, something website owners typically signal with a password requirement.

The court notes that when the CFAA was first enacted in the 1980s, it only applied to certain categories of computers that had military, financial, or other sensitive data.

“None of the computers to which the CFAA initially applied were accessible to the general public,” the court writes. “Affirmative authorization of some kind was presumptively required.”

When the law was extended to more computers in 1996, a Senate report said its goal was to “increase protection for the privacy and confidentiality of computer information.” As a result, the 9th Circuit reasons, “the prohibition on unauthorized access is properly understood to apply only to private information—information delineated as private through use of a permission requirement of some sort.”

By contrast, hiQ is only scraping information from public LinkedIn profiles. By definition, any member of the public has authorization to access this information. LinkedIn argued that it could selectively revoke that authorization using a cease-and-desist letter. But the 9th Circuit found this unpersuasive. Ignoring a cease-and-desist letter isn’t analogous to hacking into a private computer system.

LinkedIn can’t interfere with hiQ’s scraping

Monday’s ruling goes beyond finding that hiQ hasn’t violated anti-hacking laws. The appeals court also upheld a lower court order banning LinkedIn from interfering with hiQ’s scraping activities during the course of the litigation.

HiQ argued that if LinkedIn started using technical measures to block hiQ from scraping LinkedIn user data, it would be interfering with hiQ’s contracts with its own customers, who rely on that data (this is known in legal jargon as “tortious interference with contract”). HiQ noted that LinkedIn tacitly accepted hiQ’s scraping activities for several years, even sending representatives to hiQ conferences where hiQ openly explained that its products were built on LinkedIn data.

LinkedIn argued that it needed to restrict scraping to protect its own users’ privacy. HiQ countered that the data belonged not to LinkedIn but to its users, who explicitly marked the data public. HiQ doesn’t scrape non-public LinkedIn profiles. HiQ also said that LinkedIn only objected to hiQ’s scraping around the time LinkedIn launched its own analytics tools that competed with hiQ’s offerings.

Taking all of this into account, the appeals court ruled that hiQ had a good chance of proving that it was entitled to continue scraping LinkedIn’s data during the legal battle. LinkedIn might still prevail on this point, eventually giving the company the right to blacklist hiQ’s scrapers. But the appeals court upheld the lower court’s order that LinkedIn wait until after the court case was over before doing so. Otherwise, hiQ might be out of business before it had a chance to finish the case.

This remains an unsettled area of the law. In 2013, for example, a California federal trial court held that two companies may have violated the Computer Fraud and Abuse Act when they scraped data from Craigslist in order to provide user-friendly alternatives for viewing Craigslist listings.

Like the hiQ case, this earlier one was heard in California, so any appeal would have gone to the 9th Circuit, which might have been sympathetic to the defendants. But the defendants didn’t get that far: they settled the case in 2015 and agreed to stop scraping Craigslist content. Monday’s ruling suggests that they might have actually prevailed on appeal.

https://arstechnica.com/?p=1564309