In a pivotal ruling for the generative AI world, a U.S. federal judge declared that Anthropic's use of copyrighted books to train its Claude models qualifies as fair use. The tech win rattled the creative industry, and the fight is far from over: a separate trial over alleged book piracy still looms.
How far can “transformative” go before it becomes theft? That’s the uncomfortable question echoing from courtrooms to publishing houses.
Inside the Decision: What the Judge Ruled
Anthropic, the AI startup behind Claude, just scored a landmark legal victory. On Monday, Senior District Judge William Alsup ruled that the company’s training use of copyrighted books met the legal threshold for fair use — a first for any AI company in U.S. courts.
Calling the AI training “exceedingly transformative,” Alsup said Anthropic’s repurposing of written works into machine learning data aligned with the legal boundaries of fair use.
That said, the ruling isn’t a full exoneration. While the use of physical books was deemed lawful, Alsup raised red flags over how some of those materials were acquired.
According to court documents, Anthropic co-founder Ben Mann downloaded more than 7 million copyrighted books from illegal sources, including LibGen and Pirate Library Mirror. Those pirated files in turn helped determine which physical books the company bought and destroyed in the process of building its digitized library.
Anthropic’s VP Tom Turvey was allegedly “tasked with obtaining all the books in the world,” prioritizing speed over licensing logistics. The process involved gutting millions of books — literally slicing the bindings off — to scan them more efficiently.
Alsup drew a clear line: while scanning books for model training qualified as fair use, building a permanent digital catalog of pirated material did not. The judge ordered a separate trial to address the piracy claims specifically.
The ruling landed just days before Meta partially won its own copyright fair use case, signaling growing legal momentum for AI companies — and deepening anxiety for creators.
Meanwhile, a wave of backlash is building from authors and artists.
On Friday, a group of bestselling novelists including Emily Henry, R.F. Kuang, and Colleen Hoover issued an open letter urging publishers to reject AI-generated content. They called out what they described as theft of creative labor to train models that might one day replace them.
“Our stories were stolen,” the letter reads, “to build machines that could soon write the books we used to.”
The Bigger Picture: Who Gains, Who Loses
This ruling could become a reference point in nearly every copyright case involving AI training. If upheld, it strengthens the position of companies like OpenAI, Meta, and Midjourney — all of which face similar lawsuits.
For the AI sector, it means fewer barriers to scaling foundational models using massive, diverse datasets. But for authors, filmmakers, educators, and journalists, it risks stripping away control over how their work is used.
The tension boils down to a familiar digital dilemma: scale versus consent. What’s efficient for tech may be exploitative for creators.
And the financial impact is no footnote. Without licensing deals, content owners lose out on royalties. One creator’s “transformative” is another’s unpaid labor.
Expert Commentary
“The court is rewarding a company for stealing millions of books and calling it innovation,” author Victoria Aveyard said in a statement following the ruling. “If this stands, we’re handing the keys of the creative industry to machines built on theft.”
The open letter also noted: “Rather than paying writers a small percentage of the money our work makes for them, someone else will be paid for a technology built on our unpaid labor.”
GazeOn’s Take: Where It Could Go From Here
This ruling could set off a domino effect for dozens of pending AI copyright cases. But Anthropic’s partial win isn’t a clean slate. The piracy trial still poses a major reputational and financial risk.
And if more judges lean toward fair use for training, the pressure will mount on lawmakers to redefine intellectual property rights in the AI age.
One thing's clear: the courts are becoming the new front line in deciding how AI learns — and who pays the price.
Should AI training be protected by fair use — even if the data was pirated? Tell us where you stand.
