Reddit Sues Anthropic Over Unauthorized AI Data Use

Artificial intelligence company Anthropic, known for positioning itself as the "white knight" of the AI industry, faces a major lawsuit from Reddit. The complaint alleges widespread unauthorized use of Reddit's valuable site data, directly contradicting Anthropic's public claims of ethical AI development and data respect.

This legal battle highlights critical issues surrounding data privacy, intellectual property, and the rapidly evolving landscape of AI training.

Anthropic's White Knight Image Challenged

Anthropic has consistently marketed itself as prioritizing honesty, high trust, and stringent safety protocols, often delaying model releases for extensive testing. This public stance aims to establish it as the most secure and ethical AI company.

However, Reddit's lawsuit presents a stark contrast, painting a picture of corporate cognitive dissonance where Anthropic's actions do not align with its claimed values.

Key Allegations In The Anthropic Lawsuit

Filed in the Superior Court of California, County of San Francisco, the lawsuit from Reddit Inc. against Anthropic PBC includes several serious allegations:

- Breach of contract
- Unjust enrichment
- Trespass to chattels (trespassing on personal property)
- Tortious interference (interference with contractual relations)
- Unfair competition

Reddit is also demanding a jury trial.

Anthropic's Empty Promises Regarding Data Use

Reddit's complaint directly disputes Anthropic's public statements regarding its data practices:

- Personal Data: Anthropic claims it doesn't intend to train models on personal data. Reddit alleges Anthropic intentionally trained on Reddit user data without consent.

- Robots.txt Directives: Anthropic claims it honors industry-standard robots.txt files, which tell bots whether they can crawl a site. Reddit asserts Anthropic ignored these directives, a practice that numerous other websites have also denounced.

- Privacy: Anthropic states its AI chooses responses respectful of privacy. Reddit claims Anthropic, unlike competitors, refused to respect basic privacy rights, for example by not removing deleted posts from its systems.

- Blocking Bots: In July 2024, Anthropic claimed to have blocked its bots from Reddit. Reddit's audit logs show Anthropic bots continued to access Reddit content over 100,000 times in subsequent months.
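
For readers unfamiliar with how robots.txt directives work, here is a minimal sketch of how a well-behaved crawler could check them before fetching a page, using Python's standard `urllib.robotparser` module. The robots.txt content, the bot name `ExampleAIBot`, and the `example.com` URLs are all hypothetical and chosen for illustration; they are not Reddit's actual file or Anthropic's actual crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for illustration,
# not any real site's file). It bans one named bot from the whole
# site and bans every other bot from /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to crawl url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleAIBot", "OtherBot"):
        allowed = may_fetch(ROBOTS_TXT, agent, "https://example.com/r/python/")
        print(agent, "allowed" if allowed else "blocked")
```

Robots.txt is advisory: nothing in the protocol technically stops a crawler that skips this check, which is why a dispute like this one turns on audit logs and contract terms rather than on the file itself.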

The Value Of Reddit's Data For AI Training

Reddit.com is recognized as one of the most robust and valuable human-created online discussion platforms globally, making its vast corpus of public content highly desirable for training large language models.

- Early Use: As early as December 2021, Anthropic was allegedly training its Claude model on Reddit user posts without authorization, violating Reddit's user agreement.
- Research Confirmation: Anthropic researchers, including CEO Dario Amodei, acknowledged that training AI models on "large public preferences modeling data sourced from e.g. Reddit comments significantly improve sample efficiency."
- Claude's Admission: Even Anthropic's AI, Claude, confirms it was "trained on at least some Reddit data as part of my broader training set." While the reliability of AI admissions in court is debatable, it underscores the central claim.

Economic Harm And Reddit's Demands

Reddit argues that Anthropic's unauthorized commercial use of its content causes direct economic harm. Reddit has established a market for licensing its content, exemplified by formal partnerships with companies such as OpenAI and Google, which pay for access under specific terms that protect users' interests and privacy.

Anthropic's alleged scraping circumvents this market, potentially diverting users from Reddit to AI models for information derived from its platform.

The complaint also highlights Anthropic's apparent inability to remove deleted user content from its trained models. Once data has been incorporated into a model, removing specific pieces of content is virtually impossible without resource-intensive retraining.

Reddit seeks:

- Specific performance
- Compensatory damages
- Consequential damages, lost profits, or disgorgement of Anthropic's profits
- An injunction prohibiting Anthropic from further using Reddit data
- Restitution for unjust enrichment
- Punitive damages
- Attorneys' fees

The lawsuit brought by Reddit against Anthropic presents a crucial test for the rapidly evolving AI industry, challenging the integrity of companies that claim to uphold ethical standards while allegedly engaging in unauthorized data practices.

The outcome will likely shape future norms around AI training data acquisition and intellectual property rights.

As an AI educator and entrepreneur, Claudia-Ibet, co-founder of Promptus, believes the future of generative AI should be transparent, powerful, and accessible to everyone.

Claudia-Ibet's work at Promptus, a leading platform for AI-powered image and video generation, focuses on bridging the gap between advanced AI capabilities and everyday usability, ensuring that creators can wield these tools with clarity, purpose, and creative freedom.