TechCrunch Minute: Reddit is taking a stand against AI crawlers

June 28, 2024 ndowd

Reddit is taking a stand against AI companies — or at least asking them to pay up.

Earlier this week, Reddit announced that it’s updating its robots.txt file, the file that implements the Robots Exclusion Protocol. This dry-sounding edit is part of a larger negotiation (or battle) between the AI companies that are hungry for content they can use to train their language models and the companies that actually own the content.

A robots.txt file is how a website tells third parties which parts of it can be crawled. The classic example is a website that allows Google to crawl it so it can be included in search results.
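To make that concrete, here is a minimal sketch of how a crawler might interpret a robots.txt file, using Python's standard `urllib.robotparser` module. The file contents, the `example.com` URL, and the `ExampleAIBot` user agent are made up for illustration; this is not Reddit's actual robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only (not Reddit's real file):
# one named search crawler is allowed everywhere, all other bots are blocked.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a parser that can answer can_fetch() queries."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Hypothetical page URL and crawler names.
page = "https://example.com/r/some-page"
search_allowed = parser.can_fetch("Googlebot", page)
other_allowed = parser.can_fetch("ExampleAIBot", page)

print("Googlebot allowed:", search_allowed)
print("ExampleAIBot allowed:", other_allowed)
```

Note that robots.txt is advisory: it only works if a crawler chooses to check it and honor the answer, which is why a site might pair it with rate limiting and blocking.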

In the case of AI, the value exchange is a lot less obvious. When you run a website whose business model involves attracting clicks and eyeballs, there’s not much appeal in letting AI companies hoover up your content without sending you any traffic in return, and in some cases outright plagiarizing your work.

So by changing its robots.txt file, and also by continuing to rate limit and block unknown bots and crawlers, Reddit seems to be working to prevent the practices that companies like Perplexity AI have been criticized for.

Hit play to learn more, then let us know what you think in the comments!
