Meta, the tech giant formerly known as Facebook, is at a crossroads in its aggressive pursuit of generative AI. The company is considering acquiring higher-quality, more current training data to refine its AI tools, and it is eyeing the news industry as a potential source. Internally, Meta’s teams are weighing the benefits of striking new paid agreements with news publishers to secure broader access to news, photo, and video content. The strategy aims to make Meta AI more effective and strengthen its standing in the fiercely competitive generative AI market.
While Meta has not formally approached any news outlet about licensing content, the idea is gaining traction internally. Should the company proceed, the new agreements would differ from past arrangements in which Meta paid publishers merely to host links on its platforms. The focus now is on acquiring data directly for model training, a significant shift from the company’s recent disengagement from the news sector. Just last year, Meta slashed its $2 billion News division budget, signaling a departure from its erstwhile investment in journalistic content.
Meta’s CEO, Mark Zuckerberg, has expressed confidence in the company’s existing data troves for training its Llama large language model, yet there is growing concern that Meta might lag behind competitors like Google and OpenAI. Zuckerberg has previously touted Meta’s data as larger than Common Crawl, a vast web-scraped dataset that is widely used to train AI models. However, if Meta relies too heavily on its proprietary data, its models risk producing outputs that fall short of the standards set by its rivals.
The landscape of web data harvesting has changed dramatically since generative AI tools like ChatGPT became publicly available nearly two years ago. In response, many news outlets and websites have started blocking the automated bots of Common Crawl and OpenAI, which had been scraping their content for free. With the US Copyright Office now considering new regulations for generative AI, the free ride for tech giants may be nearing its end. This shift could leave Meta AI’s responses to user queries about current events less accurate and more outdated.
Unlike Meta, several leading tech companies have already inked deals with news publishers to gain privileged access to content for AI model training. Such arrangements are increasingly seen as essential for keeping AI tools relevant and accurate. For news publishers, these deals offer a financial lifeline, a view summed up by the sentiment that “something is better than nothing.” This pragmatism underscores the news industry’s readiness to engage in mutually beneficial licensing agreements.
In short, Meta’s next move in generative AI could significantly affect its competitive position. By tapping into news content through paid agreements, Meta may bolster its AI capabilities and regain ground against formidable competitors. As the company deliberates its strategy, the outcome could redefine the relationship between tech giants and the news industry, setting the stage for a new chapter in AI development.