Publishers Are Increasingly Irked at Perplexity Bots Circumventing Blocks
Publisher frustration is mounting at AI search startup Perplexity, which is backed by Jeff Bezos and other tech titans, for circumventing attempts to block its crawlers from accessing and serving up media content, potentially cutting publishers out of billions in ad revenue.
The New York Times, The Guardian, Condé Nast and Forbes are among those that have, over the last five months, blocked PerplexityBot and Perplexity AI from crawling their content and subsequently regurgitating it and, according to Forbes’ claims, earning ad revenue from it. (Perplexity doesn’t appear to serve ads, but plans to in the second half of this year.)
Perplexity—valued at $1 billion per Bloomberg—is the latest AI thorn in publishers’ side. News sites started blocking OpenAI crawlers in 2023 following the tech’s surging popularity. Some are locked in legal battles for copyright theft, while others are striking deals.
But while some AI companies and products like OpenAI and Google Overview honor publishers’ blocks on their crawlers, Perplexity doesn’t appear to. The search startup has a $20 monthly subscription offer and plans to strike revenue-sharing deals with publishers, it previously told ADWEEK.
The Guardian, which blocked Perplexity in March, wrote to the startup to request that it secure a commercial license for the use of its intellectual property, a company spokesperson told ADWEEK. Perplexity has yet to respond.
“The Guardian is not alone in seeing growing instances of theft of high-quality journalism by venture capital-backed software developers who believe that they can do so in plain sight,” the spokesperson said.
Forbes, which blocked Perplexity earlier this month after noticing copyright infringements, has demanded Perplexity remove infringing articles and reimburse the publisher for the ad revenue earned by the startup from displaying copies of its content. Perplexity CEO Aravind Srinivas defended the company’s practices on X, telling a Forbes journalist that the incident was due to a new feature with “rough edges” being refined with feedback.
Billions in lost ad revenue
Perplexity’s method of surfacing summaries and video content poses a significant threat to publishers’ display and video revenue, particularly from pre-roll videos on news sites, according to Ameet Shah, partner and svp of publisher operations and strategy at Prohaska Consulting.
Shah said the publishing industry will lose billions of dollars over the years. Across AI firms, publishers will likely lose north of $10 billion, he added.
“When publishers have blocks for ‘don’t use my content,’ and these engines can go around and monetize it, it’s the publishers’ IP that technically is getting monetized and further training the AI algorithms,” said Shah.
Perplexity has not responded to a request for comment.
Workarounds
The Times started blocking Perplexity in February this year. However, an analysis by Thomas Höppner, partner at Hausfeld law firm, showed that Perplexity was surfacing answers from the publisher, some of which were from paywalled articles.
When asked, “What does NYT write about Germany vs. Scotland?” Perplexity reproduced content from The New York Times reporting on the Euro 2024 tournament, citing The Times as its source.
Although the question is not what someone would typically type into a search engine, Perplexity’s answer, “Germany thrashed 10-man Scotland 5-1 in the opening match of Euro 2024,” closely mimics that of The Times, Höppner alleges.

“As the law and our terms of service make clear, scraping or using The Times’ content is prohibited without our prior written permission,” a Times spokesperson told ADWEEK. “We did not grant permission for the example cited here.”
The Times did not share whether it is planning to take legal action.
Condé Nast titles blocked Perplexity’s crawler via its robots.txt file earlier this year. Still, in some cases, especially on Wired, Perplexity summarizes answers similar to the publisher’s content.
When ADWEEK asked, “What is the summer trend this year as per Vogue,” Perplexity gave this response, citing the Condé Nast-owned publication as its source: “Statement gowns are out, and minimalist chic with wardrobe staples like trench coats, pencil skirts, trouser suits, and good jeans is in. Looks tagged as ‘minimalism’ were up 46% on the runway, while logo-heavy looks were down 52%.”
An article in Vogue dated June 17, 2024, includes the very same details.
Potential copyright infringement
Since the introduction of the Robots Exclusion Protocol (REP) and the robots.txt blocking scheme in 1994, search engines have promised to respect this standard and avoid crawling blocked content, according to Höppner.
However, despite claiming otherwise, Perplexity seems to disregard these standards, Höppner claims.
“At least the larger players did keep their promise,” said Höppner. “By intentionally circumventing paywalls, [Perplexity] could hardly have shouted louder for some legal attention. We are not just talking about taking a legal risk, we are talking about intentional copyright infringement.”
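For readers unfamiliar with the mechanism, a robots.txt block is simply a text file of rules that a well-behaved crawler checks before fetching a page; nothing in it technically prevents a bot from ignoring the rules. The sketch below uses Python's standard-library `urllib.robotparser` to show how a compliant crawler would interpret such a block. The rules, the `example.com` URL and the `is_allowed` helper are illustrative assumptions, not any publisher's actual file or any crawler's actual code.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules (not taken from any real publisher):
# block one named crawler everywhere, allow everyone else.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules permit this user agent to fetch the URL."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/news/some-article"  # placeholder URL
    for agent in ("PerplexityBot", "SomeOtherBot"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, url) else "blocked"
        print(f"{agent}: {verdict}")
```

The check runs entirely on the crawler's side, which is why the protocol depends on the "promise" Höppner describes: a crawler that skips this step, or that identifies itself under a different user agent, is not stopped by the file.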
Short-term gain, long-term variability
While deals with AI companies may seem tempting for cash-strapped publishers, they can cause long-term damage to the brand by strengthening AI models and undermining publishers, said Shah.
“The money is good [for publishers] for the next five to 10 years, depending on the deal,” he said. “But what happens after that? That’s the unknown part of the story.”
https://www.adweek.com/media/publishers-perplexity-ai-bots-circumventing-blocks/
