Race for AI Data Leads to Ethical Concerns: Tech Giants Scrutinized for Copyright Violations

Meta Is Seeking Additional Data Sources to Improve AI Training

As the race for artificial intelligence (AI) intensifies, tech giants are scrambling to find new sources of data to fuel their systems. At Meta, executives have been meeting almost daily to develop strategies for gathering data. However, as AI systems become more powerful, companies are pursuing data more aggressively, which could lead to copyright violations.

One example is OpenAI’s alleged use of YouTube videos to train its video generator, Sora. Although suspicions have arisen that OpenAI used copyrighted material from YouTube without permission, the company’s CTO has denied these accusations.

During meetings at Meta, executives considered several ways to obtain new data. One option was to purchase the publishing house Simon & Schuster, while others suggested paying $10 per book for licensing rights to new titles. By the time of the meetings, Meta had already summarized many books, essays, and other online works, some of which contained copyrighted material. When ethical concerns were raised during the meetings, attendees responded with silence. Meta did not immediately respond to a request for comment from Business Insider.

Ultimately, executives at Meta decided to rely on the precedent set by Authors Guild v. Google. In 2015, the U.S. Court of Appeals for the Second Circuit ruled in favor of Google, holding that its digitization of books qualified as fair use, and the Supreme Court declined to review the decision in 2016. Meta’s lawyers argued that the company could train its AI systems under similar guidelines.
