This Week in AI: OpenAI Teams Up with Publishers for Mutual Benefit

OpenAI has struck a major content licensing deal with News Corp. It raises the question of how long partnerships like this can last as AI gets smarter and less dependent on large amounts of training data.

Keeping up with an industry that moves as fast as AI is hard. Until an AI can do it for you, here is a handy roundup of recent machine learning stories, along with notable research and experiments that we didn’t cover on their own. For now, we’re increasing the frequency of our semi-regular AI column from about twice a month to once a week, so keep an eye out for more editions.

This week in AI, OpenAI said it had reached a deal with News Corp, the news publishing giant, to train OpenAI-developed generative AI models on articles from News Corp brands. The companies describe the deal as “multi-year” and “historic.” It also lets OpenAI display News Corp mastheads in apps like ChatGPT in response to certain questions, likely when the answers are drawn in whole or in part from News Corp publications.

That sounds like a win-win, right? News Corp reportedly gets more than $250 million for its content at a time when the outlook for the media industry is even grimmer than usual. Generative AI hasn’t helped; it threatens to drastically cut the referral traffic that publications receive. OpenAI, meanwhile, is already fighting copyright holders on several fronts over fair use, and now has one fewer expensive court battle to worry about. But the details matter. Keep in mind that all of OpenAI’s content licensing deals have an end date. The News Corp deal is no different.

That isn’t bad faith on OpenAI’s part in itself. Perpetual licensing is rare in media because everyone wants to keep the door open to renegotiation. Still, the arrangement looks somewhat dubious given that OpenAI CEO Sam Altman recently said the importance of training data for AI models is diminishing.

Reevaluating AI’s Data Dependency: Implications for Publishers

On the “All-In” podcast, Altman said that he “definitely does not think there will be an arms race for training data” because “at some point, when models get smart enough, it’s not about getting more data, at least not for training.”

He struck a similar note when James O’Donnell of MIT Technology Review asked him about his “optimism” that OpenAI and/or the AI industry will “find a way out of needing more and more training data.”

Models aren’t that “smart” yet, which is why OpenAI is reportedly experimenting with synthetic training data and scouring the web and YouTube for real data. But suppose that one day they no longer need much more data to get a lot better.

What does that mean for publishers, especially since OpenAI has already scraped all of their archives?

My point is that the publishers and other content owners OpenAI has worked with appear to be little more than short-term partners.

Through licensing deals, OpenAI effectively neutralizes a legal threat, at least until the courts decide how fair use applies to AI training. It also gets to enjoy a PR win. Publishers get money they badly need. And the development of AI that could seriously harm those same publishers continues.
