Reddit Takes Anthropic to Court—AI Firm Accused of Rogue Data Scraping
Reddit just dropped the legal hammer—filing suit against Anthropic for allegedly vacuuming up user data without permission. The social media giant claims the AI startup bypassed ethical (and legal) boundaries to train its models. No dollar figures disclosed yet, but you can bet the damages demand will be juicier than a VC's Series A pitch.
Data privacy meets AI ambition—and this time, it's personal. Reddit's move signals a growing backlash against unchecked scraping, even as Silicon Valley keeps pretending 'fair use' covers everything short of armed robbery. Meanwhile, Anthropic's lawyers are probably scrambling faster than a crypto trader after a 20% dip.
Ongoing issues
Several organizations have already filed lawsuits against AI developers over training data, including a high-profile case brought by The New York Times against OpenAI and Microsoft in 2023. Other plaintiffs include visual artists, authors, and record labels who argue their work was exploited without permission.
Anthropic is also facing another lawsuit regarding its alleged use of copyrighted song lyrics, as well as yet another from a group of authors who said the company used pirated versions of their books as training materials.
The tension has spilled into the cultural arena, with artists expressing outrage over AI-generated imitations of their styles.
Earlier this year, a craze for replicating the art style of the popular Japanese animation company Studio Ghibli sparked concerns about copyright violations and artists losing out to AI programs trained on their own work.
In a submission to the UK Parliament last year, OpenAI acknowledged using copyrighted content in training, arguing it would be "impossible" to develop leading AI systems without it. The company maintains that such practices are lawful.
A proposal last month in the UK to ease copyright law and allow the use of copyrighted materials for training LLMs has come under fire from prominent artists, including Elton John.
For all its protestations about protecting its users, Reddit itself sees little wrong with using user content for LLM training, as long as Reddit is compensated.
It has struck its own licensing deals with firms like OpenAI, Google, Sprinklr, and Cision to allow access to its content for training purposes.
Edited by Sebastian Sinclair