Meta pauses plans to train AI on European users’ data, bowing to regulatory pressure
Meta has shelved plans to use public content from its UK and EU users to train AI after the Irish Data Protection Commission and the UK’s Information Commissioner’s Office raised privacy concerns.
Meta has paused plans to train its AI systems using data from its users in the UK and EU. The pushback came from the Irish Data Protection Commission (DPC), Meta’s lead regulator in the EU, which acts on behalf of several data protection authorities across the bloc.
The UK’s Information Commissioner’s Office (ICO) also requested that Meta pause its plans until it could satisfy the concerns the regulator had raised.
“The DPC welcomes the decision by Meta to pause its plans to train its large language model using public content shared by adults on Facebook and Instagram across the EU/EEA,” the DPC said in a statement Friday.
The decision followed intensive engagement between the DPC and Meta, and the regulator, together with its fellow EU data protection authorities, will continue to engage with the company on the issue.
Meta already taps user-generated content to train its AI in markets such as the U.S. However, Europe’s stringent GDPR regulations have created obstacles for Meta, and for other companies, looking to use user-generated content to train their AI systems, including large language models.
Last month, Meta began notifying users of an upcoming change to its privacy policy, one that it said would give it the right to use public content on Facebook and Instagram to train its AI.
That content includes comments, interactions with companies, status updates, photos and their associated captions. The company argued that it needed to do this to reflect “the diverse languages, geography and cultural references of the people in Europe.”
These changes were due to take effect on June 26, 2024, 12 days from now. But the plans spurred the not-for-profit privacy activist organization NOYB (“none of your business”) to file 11 complaints with constituent EU countries, arguing that Meta is contravening various aspects of GDPR.
One of those complaints relates to the issue of opt-in versus opt-out: where personal data processing is involved, users should be asked for their permission first, rather than being required to take action to refuse.
Meta, for its part, relied on a GDPR provision called “legitimate interests” to contend that its actions were lawful. The company has leaned on this legal basis before, most recently to justify processing European users’ data for targeted advertising.
It was always likely that regulators would at least put Meta’s planned changes on hold, especially given how difficult the company had made it for users to “opt out” of having their data used. The company said it sent out more than 2 billion notifications informing users of the upcoming changes.
These notifications didn’t appear at the top of users’ feeds the way other prominent public messaging, such as prompts to go out and vote, does. Instead, they sat alongside users’ standard notifications: friends’ birthdays, photo tag alerts, group announcements and more. So if someone doesn’t check their notifications regularly, it was all too easy to miss this one.
And those who did see the notification wouldn’t automatically know there was a way to object or opt out, as it simply invited users to click through to find out how Meta would use their information. There was nothing to suggest that a choice existed.
Users also technically couldn’t “opt out” of having their information used. Instead, they were required to complete an objection form setting out their reasons for objecting to the processing of their data.
Whether that request was honored was entirely at Meta’s discretion, although the company said it would honor every one.
Even though the notification itself included a link to the objection form, anyone seeking it out independently through their account settings faced a needlessly long trek.
First, they had to click their profile photo at the top right of Facebook’s website, hit settings & privacy, then tap the Privacy Center. Next, they had to scroll down and click on the Generative AI at Meta section, then scroll down again, past a bunch of links, to a section titled “more resources.” The first link under that section is one called “How Meta uses information for generative AI models,” and they needed to read through some 1,100 words before getting to a discrete link to the company’s “right to object” form. It was a similar story in the Facebook mobile app, too.
When asked this week why users had to file an objection rather than opt in, Meta’s policy communications manager, Matt Pollard, pointed to an existing blog post.
The post says, “We believe this legal basis [“legitimate interests”] is the most appropriate balance for processing public data at the scale needed to train AI models while respecting people’s rights.”
In other words, making this opt-in likely wouldn’t generate enough people willing to offer their data. So the best way around that was to issue a solitary notification in among users’ other notifications; hide the objection form behind half a dozen clicks for those seeking the “opt-out” independently; and then make users justify their objection, rather than giving them a straight opt-out.
Stefano Fratta, Meta’s global engagement director for privacy policy, said in a new blog post today that the company was “disappointed” by the DPC’s request.
“This is a step backwards for European innovation and competition in AI development, and it further delays bringing the benefits of AI to people in Europe,” Fratta wrote. “We remain highly confident that our approach complies with European laws and regulations. AI training is not unique to our services, and we’re more transparent than many of our industry counterparts.”
Meta’s AI ambitions highlight Big Tech’s data dilemma
None of this is new, though: Meta is one player in an AI arms race that has shone a spotlight on the vast troves of data Big Tech holds on its users.
Earlier this year, Reddit revealed that it stands to make more than $200 million over the next few years from licensing its data to companies including ChatGPT maker OpenAI and Google.
Google, the latter of those two, is already facing huge fines for leaning on copyrighted news content to train its generative AI models.
But these efforts also highlight the lengths companies will go to in order to use this data within the confines of existing legislation: “opting in” is rarely on the table, and the process of opting out is often needlessly convoluted.
Just last month, someone spotted questionable wording in Slack’s privacy policy suggesting the company could use customer data to train its AI systems, with users able to opt out only by emailing the company.
And last year, Google finally gave online publishers a way to opt their websites out of training its models, by letting them add a piece of code to their sites. OpenAI, for its part, is building a dedicated tool to allow content creators to opt out of training its generative AI; this should be ready by 2025.
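For context, the mechanism Google introduced is the Google-Extended token, which publishers reference from a site’s robots.txt file. A minimal sketch of an entry that opts an entire site out would look like this:

  User-agent: Google-Extended
  Disallow: /

Because Google-Extended is distinct from the regular Googlebot crawler token, blocking it keeps a site’s content out of Google’s AI training without affecting how the site is crawled and ranked in Search.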
Returning to Meta: its plans to train AI on users’ public content in Europe are on hold for now, but they will likely return in another form, hopefully with a different user-permission process in tow.
“In order to get the most out of generative AI and the opportunities it brings, it is crucial that the public can trust that their privacy rights will be respected from the outset,” Stephen Almond, the ICO’s executive director for regulatory risk, said in a statement today. “We will continue to monitor major developers of generative AI, including Meta, to review the safeguards they have put in place and ensure the information rights of UK users are protected.”