OpenAI has unveiled the latest version of its ChatGPT bot, marking a significant advancement in the field of conversational artificial intelligence.
On Tuesday, OpenAI rolled out an advanced voice mode for ChatGPT, offering users their first experience with GPT-4o’s hyperrealistic audio capabilities. Initially, the enhanced version will be accessible to a limited group of ChatGPT Plus users, with a subscription priced at $20 (Dh74 approx.) per month.
However, the company plans to extend the feature to all premium users gradually between September and November.
The new release promises enhanced capabilities, greater accuracy and a more human-like interaction experience, and is set to transform the way users engage with AI through real-time, voice-driven conversations.
OpenAI's use of hyperrealistic voice synthesis means that ChatGPT can produce speech that closely mimics human intonation, rhythm, and emotion. Users will find the AI's voice interactions to be engaging and intuitive, with responses that sound remarkably human. This development marks a significant step forward in making AI more accessible and user-friendly.
You might already be familiar with the Voice Mode currently available in ChatGPT, but OpenAI's new Advanced Voice Mode offers a notable upgrade.
A significant focus of this release is on making interactions with the ChatGPT bot feel more natural and human-like. OpenAI has worked on refining the conversational tone of the bot, making it capable of understanding and replicating various styles of communication. Whether the user prefers a formal tone for business interactions or a casual, friendly chat, the new voice mode will be able to adapt accordingly.
Previously, ChatGPT relied on three separate models for its voice feature: one to transcribe your voice to text, GPT-4 to process the input, and another to convert the text back into speech. In contrast, GPT-4o is built on a multimodal system that handles all these tasks internally, resulting in significantly lower latency during conversations. This leads to a much quicker response rate, bringing it closer to real-life human interaction.
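The cascaded pipeline described above can be sketched in a few lines of Python. This is purely illustrative: the function names and their stand-in behaviour are placeholders invented for this sketch, not OpenAI's actual code; the point is that each hand-off between stages adds its own latency, which a single multimodal model avoids.

```python
def transcribe(audio: bytes) -> str:
    """Stage 1: speech-to-text (previously a separate model)."""
    return audio.decode()  # stand-in: treat the bytes as their own transcript

def respond(text: str) -> str:
    """Stage 2: the language model (GPT-4) processes the text input."""
    return f"Reply to: {text}"

def synthesize(text: str) -> bytes:
    """Stage 3: a text-to-speech model converts the reply back into audio."""
    return text.encode()

def cascaded_voice_turn(audio: bytes) -> bytes:
    """Old approach: three sequential hand-offs, each adding latency."""
    return synthesize(respond(transcribe(audio)))

print(cascaded_voice_turn(b"Hello"))
```

A multimodal model like GPT-4o collapses these three hand-offs into one audio-in, audio-out call, which is why the article reports significantly lower latency.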
Additionally, OpenAI asserts that GPT-4o can also detect emotional intonations in your voice, such as sadness, excitement, or even singing.
Initially announced in May, the new voice feature has launched a month later than planned. OpenAI delayed the release to enhance safety measures, ensuring the model can effectively detect and reject inappropriate content.
As with any AI advancement, the introduction of voice capabilities brings ethical considerations and security challenges. OpenAI says it has implemented safeguards to prevent misuse of the voice feature, which include measures to detect and mitigate inappropriate content, as well as systems to ensure that voice data is handled securely and privately.
“We tested GPT-4o's voice capabilities with over 100 external red teamers across 45 languages,” OpenAI announced on X. “To protect people's privacy, we’ve trained the model to only use the four preset voices and developed systems to block any outputs that deviate from those voices. Additionally, we’ve implemented guardrails to prevent requests for violent or copyrighted content.”
In an effort to prevent the model from being misused for creating audio deepfakes, which has become a significant threat to the information economy in recent times, OpenAI has developed four preset voices in collaboration with voice actors. The advanced voice options are designed in a way that avoids impersonating other individuals.
When OpenAI first demonstrated GPT-4o's voice capabilities in May, the voice named Sky drew significant criticism for its close resemblance to that of actress Scarlett Johansson. The actress publicly stated that OpenAI had sought her permission to use her voice, which she had declined. Upon hearing the similarity in the model's demo, she engaged legal counsel to protect her rights.
OpenAI is also committed to transparency and user consent. Users are informed when interacting with AI-generated voices, ensuring that they are aware of when they are communicating with an artificial entity.
However, challenges remain. The potential for misuse of conversational AI, such as generating misleading or harmful information, requires continuous monitoring and improvement of the technology.
As with any new feature, potential pitfalls in safety and security will become easier to gauge once the advanced voice mode is rolled out at scale and user feedback is gathered in real time.