OpenAI said it plans to discontinue using one of its ChatGPT voices after Scarlett Johansson alleged it sounded “eerily similar” to her own.
On Monday, the company said it is “working to pause” the voice known as Sky, one of five voices users can speak with when using ChatGPT.
Ms. Johansson, who voiced a fictional AI assistant in the 2013 film “Her,” was among those raising questions about how OpenAI selects the lifelike audio options for its flagship artificial intelligence chatbot.
The 39-year-old actress issued a statement claiming she had been approached last year by OpenAI CEO Sam Altman, who asked whether she would lend her voice to ChatGPT.
“Last September, I received an offer from Sam Altman, who wanted to hire me to voice the current ChatGPT 4.0 system,” she said in a statement provided to The Epoch Times by her publicist, Marcel Pariseau. “He told me that he felt that by my voicing the system, I could bridge the gap between tech companies and creatives and help consumers to feel comfortable with the seismic shift concerning humans and AI. He said he felt that my voice would be comforting to people.”
Ms. Johansson said after considering the offer, she decided for “personal reasons” to decline.
She also alleged that just days before the ChatGPT 4.0 demo was released, Mr. Altman reached out to her agent, asking if she would reconsider his offer.
“Before we could connect, the system was out there,” Ms. Johansson continued.
OpenAI, for its part, said it could not reveal the name of the voice talent behind Sky, citing privacy concerns. The company did not immediately respond to The Epoch Times’ request for more information.
The San Francisco-based company launched voice capabilities for ChatGPT in September 2023. The rollout, which included five different voices, allowed users to engage in back-and-forth conversations with the AI assistant.
Previously, “Voice Mode” was available only to paid subscribers, but in November, the company announced it would become a free feature for mobile app users.
ChatGPT interactions are becoming increasingly sophisticated.
OpenAI said the latest update to its generative AI model can mimic human cadences in its verbal responses and can try to detect users’ moods.
The latest model, GPT-4o, short for “omni,” is faster than earlier versions and can reason across text, audio, and video in real time, the company says.
During a May 13 demonstration, the AI bot chatted in real time and added “more drama” to its voice when asked. It also tried to gauge a person’s emotional state from a selfie video of their face and assisted with language translation, step-by-step math problems, and more.
GPT-4o is not yet widely available, but it will be released to select users in the coming weeks and months. The model’s text and image capabilities have begun rolling out and are expected to reach some users of ChatGPT’s free version. The new voice mode will be available only to paid ChatGPT Plus subscribers.
Proposed Legislation
Rep. Don Beyer (D-Va.) urged the House on Tuesday to consider his legislation following Ms. Johansson’s questions about the voice used by OpenAI.

The AI Foundational Model Transparency Act would establish “transparency standards for information that high-impact foundation models must provide to the FTC [Federal Trade Commission] and the public, including how those AI models are trained and information about the source of data used.”