Cloudflare will block AI crawlers on the Internet

Last June brought two pieces of good news for companies that develop generative artificial intelligence tools. Two rulings issued in California, one affecting Anthropic and the other Meta, found that feeding copyrighted books to their artificial intelligence models fell within the fair use permitted by US law. That is, the companies did not have to compensate anyone financially for doing so. However, not everything is moving in that direction, and now Cloudflare, one of the main providers of Internet infrastructure, has taken a measure that will make things harder for chatbots that generate their responses from the information they find on the web.

From now on, Cloudflare will block known AI web crawlers by default to prevent them from ‘accessing content without permission or compensation’, as it announced on Tuesday. With this change, Cloudflare will begin asking new website owners whether they authorize access by AI crawlers, and will allow them to charge a ‘pay per crawl’ fee.

This matters because Cloudflare is one of the pillars on which the Internet rests, providing services such as a CDN (which distributes content across servers worldwide so that websites load faster from anywhere), DNS (which translates domain names into IP addresses that computers understand) and attack protection, among others. A good part of Internet traffic passes through its servers and services.

Pay per crawl

The Pay Per Crawl program will allow publishers to set a price for AI crawlers to access their content. AI companies will be able to check the rates and decide whether to sign up and pay the fee or walk away. For now, the program is only available in beta for ‘a group of some of the main publishers and content creators’, including The Associated Press, The Atlantic, Fortune, Stack Overflow, Quora and others, but Cloudflare says it will ensure that ‘AI companies can use quality content the right way: with permission and compensation’. Website administrators who are interested can sign up for the beta.
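The choice described above, in which an AI company checks a publisher's rate and either pays or walks away, can be sketched as follows. This is only an illustration of the decision, not Cloudflare's actual interface: the `CrawlOffer` type, the `decide` function, the per-request pricing unit and the example prices are all hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class CrawlOffer:
    """A publisher's advertised rate (hypothetical structure)."""
    site: str
    price_per_request: Decimal  # assumed unit: currency per crawled page


def decide(offer: CrawlOffer, max_price: Decimal) -> str:
    """Return 'pay' if the rate fits the crawler's budget, else 'skip'."""
    return "pay" if offer.price_per_request <= max_price else "skip"


# Made-up rates for two imaginary publishers.
offers = [
    CrawlOffer("news.example", Decimal("0.01")),
    CrawlOffer("forum.example", Decimal("0.10")),
]
budget = Decimal("0.05")
decisions = {offer.site: decide(offer, budget) for offer in offers}
```

Using `Decimal` rather than floating point avoids rounding surprises when comparing small monetary amounts.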

Cloudflare wants to curb the voracity of AI

Cloudflare has been helping website owners defend themselves against AI crawlers for some time. One of the sector's concerns is that, since AI took off and Google added it to its search engine, sites have seen visits to their pages fall, because users already find what they are looking for in a chatbot such as ChatGPT or in the AI Overviews that summarize the information users search for on Google. Access to the information that feeds chatbot responses has been, until now, a free-for-all for AI companies, but Cloudflare wants that to change.

‘People trust AI more over the last six months, which means they are not reading the original content’, said Cloudflare CEO Matthew Prince during an Axios Live event last week.

The company began allowing websites to block AI crawlers in 2023, although this only applied to crawlers that respected the site's robots.txt file. This is a text file hosted on the server that websites use to tell bots and crawlers which parts of their content they may or may not crawl and index. The company identifies the crawlers to block by comparing them against its list of known bots.
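As a rough illustration of how a well-behaved crawler consults robots.txt before fetching a page, here is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content, the bot names `ExampleAIBot` and `OtherBot`, and the URLs are invented for the example and do not come from Cloudflare.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut one AI crawler out of the whole site,
# and keep every other bot out of /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory, without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A compliant crawler asks before each request and skips the URL on False.
ai_bot_allowed = parser.can_fetch("ExampleAIBot", "https://example.com/articles/1")
other_bot_allowed = parser.can_fetch("OtherBot", "https://example.com/articles/1")
other_bot_private = parser.can_fetch("OtherBot", "https://example.com/private/notes")
```

Nothing forces a crawler to perform this check, which is why a block that relies on robots.txt only affects bots that choose to respect it.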

Last year, Cloudflare allowed sites to block ‘all’ AI bots, regardless of whether or not they respected robots.txt, and this setting is now enabled by default for new Cloudflare customers.

In addition, in March Cloudflare launched a feature that diverts crawler bots into an ‘AI Labyrinth’ to discourage them from extracting content without permission. This system deters scraping (the automated collection of website data) by redirecting crawler bots to links that are fake or contain no useful information, making them waste time and resources on useless processing.

Cloudflare says it is collaborating with AI companies to help verify their crawlers and allow them to ‘clearly declare their purpose’, such as whether they use the content for training, inference or search. Website owners will be able to review this information and decide which crawlers to allow.

‘Original content is what makes the Internet one of the greatest inventions of the last century, and we have to come together to protect it’, Prince says in the press release. ‘AI crawlers have been scraping content without limits. Our goal is to put the power back in the hands of creators, while still helping AI companies innovate’, he adds.