Meta, formerly known as Facebook, is at a crossroads as it pushes deeper into generative AI. The company is weighing a strategy of paying for premium, timely training data, specifically from the news industry. People inside Meta are discussing new paid agreements with news publishers that would give the company deeper access to their articles, photos, and videos. Such deals could help sharpen Meta’s generative AI tools, such as Meta AI, making them more user-friendly and more competitive in a space where rivals like Google and OpenAI are already well established.
The discussions within Meta are purely internal at this stage, with no formal overtures made to any news outlets about licensing or accessing content. However, if Meta decides to proceed, these agreements would be distinct from previous deals where the company paid publishers merely to host links to their content. Meta’s recent history with the news industry has been anything but smooth. Over the past 18 months, the company has dramatically scaled back its involvement, even going so far as to slash a $2 billion budget for its News division last year.
Meta’s CEO, Mark Zuckerberg, has made bold claims about the company’s self-sufficiency in data. He said earlier this year that Meta has amassed a dataset larger than Common Crawl, a comprehensive collection of web-scraped data widely used for AI training. Even so, Meta could find itself lagging behind competitors if it continues to lean heavily on its own data pool. The landscape has changed significantly since the launch of ChatGPT nearly two years ago brought generative AI into the mainstream. In response, news outlets and other websites have started blocking automated bots, such as those deployed by Common Crawl and OpenAI, from scraping their content for free.
Adding another layer of complexity, the US Copyright Office is contemplating new rules to govern generative AI. Without a steady supply of freely available content from news publishers, Meta AI’s answers to user queries about current events could become less accurate, timely, and reliable. Meta’s competitors in generative AI, by contrast, have already struck deals with news publishers and media outlets to secure content for model training.
Many news publishers appear open to licensing deals with tech giants, likely driven by the notion that “something is better than nothing.” For Meta, such agreements could supply the high-quality training data needed to improve Meta AI’s performance, helping the company stay competitive and keeping its generative AI tools relevant and effective.
In summary, Meta’s internal discussions are ongoing, but a shift toward paid content deals with news publishers could mark a significant turning point. Whichever way Meta decides, the outcome is likely to shape its position in the generative AI market and its relationship with the news industry.