How OpenAI Improved Voice Mode Safety and Functionality in ChatGPT


OpenAI Enhances ChatGPT with Advanced Voice Mode

OpenAI has begun rolling out a refined voice mode for ChatGPT to a limited group of subscribers to its Plus plan. The feature was first previewed at the GPT-4o launch event in May and faced backlash because one of its voices resembled Scarlett Johansson’s voice in the movie Her. After months of safety evaluations and functionality improvements, OpenAI is now confident in the new, safer version of its voice mode.

The Scarlett Johansson Controversy

One of the initial voices, named “Sky,” created a stir because of its uncanny similarity to Scarlett Johansson’s AI character in Her. The resemblance sparked public concern and prompted legal inquiries from the actress herself. OpenAI halted the feature’s rollout while it addressed these issues.

Robust Safety Measures and Ethical Considerations

OpenAI took multiple steps to ensure the new voice mode would not repeat past mistakes. Here’s what they did:

  • Stress Testing with External Experts: Over 100 external experts, referred to as “red teamers,” were brought in to identify potential weaknesses and biases in the system.
  • Content Filters: New filters have been implemented to block requests for generating copyrighted audio, such as music, and to detect and refuse potentially harmful content.
  • Voice Actor Collaboration: The new voice mode avoids mimicking real people by using four preset voices created with the help of professional voice actors.

These measures underscore OpenAI’s commitment to ethical AI development and safety.

More Interactive and Natural Conversations

The revamped voice mode, demonstrated at the GPT-4o event, allows for more natural and interactive conversations. Users can interrupt ChatGPT mid-sentence and ask it to change the way a story is told, and it adapts on the fly. The feature aims to provide a smoother, more responsive user experience.

Gradual Rollout to Plus Subscribers

The enhanced voice mode is initially available to a select group of subscribers, and OpenAI plans to extend it to all ChatGPT Plus users in the fall. Users selected for this alpha will receive instructions via email and a message in their mobile app. The company will continue to add more users on a rolling basis.

Here’s what OpenAI said in an announcement: “Users in this alpha will receive an email with instructions and a message in their mobile app. We’ll continue to add more people on a rolling basis and plan for everyone on Plus to have access in the fall. As previously mentioned, video and screen sharing capabilities will launch at a later date.”

Ensuring Privacy and Safety

To ensure privacy, OpenAI trained the model to use only the four preset voices and developed systems to block outputs deviating from these voice options. Additionally, measures have been implemented to prevent requests for violent or copyrighted content.

Such efforts showcase OpenAI’s focus on maintaining high ethical standards and prioritizing safety.

What Users Can Expect

Subscribers who are part of the initial rollout can expect the following features:

  • Real-Time Responses: ChatGPT responds with little delay, keeping conversations smooth even when the user interrupts.
  • Emotion-Sensitive Replies: The AI can sense and respond to a user’s emotional tones, making conversations more relatable and engaging.

These improvements aim to make AI interactions as close to human conversations as possible while keeping user safety at the forefront.

Looking Ahead

The rollout is expected to run from late September to December, with full availability to all Plus subscribers anticipated by the end of 2024. OpenAI’s enhancements to both the safety and functionality of ChatGPT’s voice mode indicate a balanced approach to cutting-edge technology and responsible AI use.

Read More: TechRadar

