OpenAI: ChatGPT Can Now See, Hear, and Speak
OpenAI unveiled ChatGPT's latest feature on Monday, September 25th: new voice and image capabilities. According to the company's official announcement, these offer a more intuitive interface that lets users hold voice conversations with ChatGPT or share images with it.
How to activate the voice function in ChatGPT
To activate the voice function, go to Settings → New Features in the mobile app and opt in to voice conversations. Then tap the headphone icon in the top-right corner of the home screen and choose one of five voices.
OpenAI says the voice capability is powered by a new text-to-speech model that can generate human-like audio from just text and a few seconds of sample speech. The company worked with professional voice actors to create each voice and uses Whisper, its open-source speech recognition system, to transcribe users' spoken words into text.
How to show images to ChatGPT
To chat about images with ChatGPT, tap the photo button to capture or select an image. On iOS or Android, tap the plus button first. You can also discuss multiple images or use the drawing tool to guide your assistant.
ChatGPT's image understanding is driven by multimodal GPT-3.5 and GPT-4 models, applying their language reasoning skills to a wide range of visual content, including photographs, screenshots, and documents containing both text and images.
Why are voice and image capabilities not available in my ChatGPT?
If voice and image capabilities are not yet available in your ChatGPT, it is because OpenAI is deploying these features gradually, which it ties to its goal of building artificial general intelligence (AGI) that is safe and beneficial. The gradual rollout lets OpenAI make continuous improvements and refine its risk mitigation strategies while preparing users for more powerful systems in the future.
For more information, see OpenAI's blog post on this topic.