OpenAI Accused of Training AI with Stolen Web Content: Balaji Speaks
Balaji, a former OpenAI employee, alleged that the company trained its AI on stolen web content, a practice that has led to lawsuits over its data collection.
Balaji, who left OpenAI in August, recently alleged that the company trained its chatbot on content taken from the internet without permission. Multiple lawsuits have targeted the prominent artificial intelligence (AI) company over its data collection practices.
Balaji was a talented Indian American who grew up in Cupertino, California. He excelled in programming contests, placing 31st at the ACM ICPC 2018 World Finals and first at the Pacific Northwest Regional and Berkeley Programming Contests in 2017.
Like many people in his field, Balaji had been interested in AI since he was young. In an interview with The New York Times in October, he said that he became interested in AI after reading a news story about it as a teenager. He thought that neural networks might be able to solve the world’s biggest problems.
“I believed AI could solve unsolvable problems, such as curing diseases and halting the aging process,” he said, according to the NYT report. “I believed we could create a type of scientist who could assist us in solving these problems.” In 2020, he joined a group of Berkeley graduates working for OpenAI.
Balaji’s Role in Training ChatGPT
Balaji worked at OpenAI for four years. For a year and a half of that time, his primary responsibility was gathering and organizing the vast amounts of internet data used to train ChatGPT, the company’s chatbot.
In the interview with The New York Times, Balaji said that when he first started working at OpenAI, he did not give much thought to whether it was legal to use both copyrighted and open internet data to build the company’s products.
After ChatGPT came out in late 2022, he began to question the ethics of these practices. He came to believe that tools like ChatGPT were hurting the internet by using protected content without permission.
He decided in 2024 that “he no longer wanted to contribute to technologies that he believed would bring society more harm than benefit.” He quit OpenAI in August but didn’t find another job right away. Instead, he worked on “personal projects.”
A court document named him as an individual whose records OpenAI would review in a lawsuit against the company. He died the next day.