Leaked Emails Expose Meta’s Massive AI Training Piracy Scandal

Tech giant Meta was accused last month of training its AI systems illegally. The lawsuit alleges that Facebook’s parent company used pirated content, such as articles and ebooks, to do so.

Now, newly unsealed emails provide fresh evidence of how Meta went about it. The case was brought by book authors, and the latest round of leaked communications marks a breakthrough in the lawsuit.

The emails brought to light how Meta admitted to the controversial act and how it torrented LibGen, a major dataset containing tens of millions of pirated works. According to the authors’ filing, Meta torrented 81.7 TB of data from several shadow libraries through Anna’s Archive. This includes 35.7 TB of data from Z-Library and LibGen. The filing adds that Meta had previously torrented 80.6 TB of data from LibGen.

The authors described the magnitude of the company’s unlawful torrenting scheme as shocking, to say the least. In their view, it is a serious criminal offense that warrants an in-depth investigation.

The emails show that employees at Meta were well aware of the legal risks attached to these actions. In 2023, a leading research engineer at the firm remarked that torrenting from a corporate laptop did not feel right.

In the internal message, engineer Nikolay Bashlykov raised serious concerns about using Meta IP addresses and corporate laptops to download pirated material. In September 2023, he went a step further, pressing his objections and consulting legal experts on the matter.

Torrenting makes it possible to seed files, which means sharing the data with the outside world, and he noted that doing so was not legally acceptable. The authors argue that, despite these warnings, the tech giant knew what it was getting into but still chose to carry on with illegal actions.

According to the authors, Meta tried to hide the activity by changing its settings so that only the smallest possible amount of seeding could take place. Meta also tried to reduce the risk of anyone tracing the downloader or seeder involved by downloading the data to servers that Meta did not own.

Image: DIW-Aigen
