Cloudflare’s Decision to Block AI Crawlers Could Affect Your Performance in LLMs

July 10, 2025 | Jill

As a popular content delivery network (CDN), Cloudflare acts as a gateway to about 20% of the internet, enhancing performance and improving security. But soon, Cloudflare will be doing more: blocking, by default, the AI crawlers that feed large language models (LLMs) like ChatGPT and Perplexity.

Cloudflare has proposed a “Pay per Crawl” initiative that would offer compensation for the information LLMs gather to use in their generative answers. As of today, the program is in private beta, with sites needing to opt in. But at some as-yet-undetermined point, Cloudflare will begin to block AI crawlers by default, essentially forcing the Pay per Crawl model on approximately 20% of the internet. 

After the initiative goes live, site owners who don’t know about Pay per Crawl will not get to decide whether they want to appear in LLMs’ generative answers. They just won’t appear, won’t be aware that they’re not appearing, and, even if they notice the drop, won’t know why.

Why Is Cloudflare Blocking AI Crawlers?

Many site owners are frustrated with the current model in which LLMs index content from their sites to feed the LLMs’ generative results. Cloudflare’s initiative offers a Pay per Crawl model that forces LLMs to decide whether your content is worth paying for.


It’s an intriguing proposition for publishers, certainly, but it raises a question for each individual site owner: Is your content worth paying for? Or do you get more out of the relationship — out of the brand awareness generated by inclusion in LLMs’ generative answers — than the LLMs do from including your content? Can the AI crawlers get similar or better information from other sources that they don’t have to pay for? Those are the questions you have to answer.

The answer is likely to be different for major publishers and large corporations than it is for small to mid-sized businesses (SMBs) or small bloggers and publishers. For a major news publisher, absolutely, it makes sense — the LLMs need that content to create their generative answers. But for small players? Do LLMs really need you enough to pay for your content? I’m not sure that the answer is yes, so blocking AI crawlers may be more harmful to SMBs and small publishers than allowing them to scrape.

Cloudflare hails this program as a way to make AI fair by compensating sites for the use of their content. Giving site owners the ability to decide is absolutely the right thing to do. However, blocking AI crawlers by default is not the right answer to the problem.

How Does Cloudflare’s Pay per Crawl Program Work?

When a crawler matching the user agent string of one of the designated LLMs knocks at Cloudflare’s door to access a site, and that site is part of the Pay per Crawl model, the blocked bot will receive a 402 (Payment Required) HTTP response code. A 402 signals to the user agent (in this case, the LLM bot) that the content is not available unless a payment is made.
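The gating logic can be sketched in a few lines. This is an illustrative sketch only, not Cloudflare’s actual implementation: the user-agent substrings and the `payment_offered` flag are hypothetical stand-ins for however a crawler would signal willingness to pay.

```python
# Hypothetical list of AI crawler user-agent substrings; real gateways
# maintain verified bot lists rather than simple substring matches.
AI_CRAWLER_SIGNATURES = ["GPTBot", "PerplexityBot", "OAI-SearchBot"]

def handle_crawl_request(user_agent: str, pay_per_crawl: bool,
                         payment_offered: bool) -> int:
    """Return the HTTP status code a gateway might send for a crawl request."""
    is_ai_crawler = any(sig in user_agent for sig in AI_CRAWLER_SIGNATURES)
    if is_ai_crawler and pay_per_crawl and not payment_offered:
        # 402 Payment Required: the content is gated behind payment.
        return 402
    return 200  # Serve the content normally.

print(handle_crawl_request("Mozilla/5.0 (compatible; GPTBot/1.0)", True, False))  # 402
print(handle_crawl_request("Mozilla/5.0 (compatible; GPTBot/1.0)", True, True))   # 200
print(handle_crawl_request("regular-browser-agent", True, False))                 # 200
```

The key point is that ordinary browsers and traditional search crawlers fall through to a normal 200 response; only the matched AI bots hit the paywall.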

Cloudflare’s documentation hasn’t yet specified which bots will be included in the blocking, but it does define three categories of AI crawlers today: AI Assistant (such as Perplexity-User and DuckAssistBot), AI Crawler (such as Google Bard and ChatGPT bot), and AI Search (such as OAI-SearchBot). Cloudflare’s CEO, Matthew Prince, also mentioned yesterday on social platform X that “Gemini is blocked by default.” It may also be possible to specify which categories or which bots to block or allow, but that hasn’t been clarified, either.
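If per-category control does materialize, a site’s policy might look something like the table below. To be clear, Cloudflare has not confirmed configuration at this granularity; the category names and bot examples come from its current documentation, while the allow/block choices and the `decide` helper are hypothetical.

```python
# Hypothetical mapping of known bots to Cloudflare's three AI crawler categories.
CATEGORY_OF_BOT = {
    "Perplexity-User": "AI Assistant",
    "DuckAssistBot": "AI Assistant",
    "OAI-SearchBot": "AI Search",
}

# Hypothetical per-category site policy (not a real Cloudflare setting).
POLICY = {
    "AI Assistant": "allow",
    "AI Crawler": "block",
    "AI Search": "block",
}

def decide(bot_name: str) -> str:
    """Return the policy ('allow' or 'block') this sketch applies to a bot."""
    category = CATEGORY_OF_BOT.get(bot_name)
    if category is None:
        return "allow"  # Unrecognized bots pass through in this sketch.
    return POLICY[category]

print(decide("OAI-SearchBot"))    # block
print(decide("Perplexity-User"))  # allow
```

A setup like this would let a site keep assistant-style fetches (which often cite and link sources) while blocking bulk training crawls — exactly the trade-off the article says each owner must weigh.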

How Will Traditional SEO Be Impacted?

The Pay per Crawl initiative appears to have no impact on traditional search engines, such as Google and Bing. However, LLM search engines like SearchGPT and Perplexity may be caught up in the blocking. And Cloudflare is reportedly working on convincing Google (by negotiation or by law) to split its search crawler, Googlebot, into two crawlers: one for traditional search and one for AI crawling to feed AI Overviews and AI Mode, so that the AI version could be blocked.

While the likelihood of success is questionable, search engine optimization (SEO) professionals would applaud the success of splitting Googlebot into search and AI crawlers.

Offering site owners a choice as to whether to allow AI crawlers to scrape their content is the right thing to do. But that choice should be offered, not implemented by default. The face of the internet and how information is accessed is rapidly changing. For many site owners, especially those without the benefit of being the biggest names in the space, not being present in AI could be as dangerous as not being compensated for their content.
