ChatGPT can speak, listen, and see images now


The generative artificial intelligence (AI) space continues to heat up as OpenAI introduces GPT-4V, a vision-enabled model, along with multimodal conversation modes for its ChatGPT system.

With the new updates, announced on September 25, users will be able to hold spoken conversations with ChatGPT. The models powering ChatGPT, GPT-3.5 and GPT-4, can now understand queries spoken in plain language and respond in one of five different voices.

According to an OpenAI blog post, this new multimodal interface will allow users to interact with ChatGPT in new ways:

"Snap a picture of a landmark while traveling and have a live conversation about what's interesting about it. When you're home, snap pictures of your fridge and pantry to figure out what's for dinner (and ask follow-up questions for a step by step recipe). After dinner, help your child with a math problem by taking a photo, circling the problem set, and having it share hints with both of you."

The enhanced version of ChatGPT will roll out to Plus and Enterprise users on mobile platforms in the next two weeks, with access for developers and other users to follow "soon after."

ChatGPT's multimodal update comes on the heels of the release of DALL-E 3, OpenAI's most advanced image generation system.

According to OpenAI, DALL-E 3 also integrates natural language processing. This lets users talk to the model to fine-tune results and integrates ChatGPT to help users create image prompts.

In other AI news, OpenAI competitor Anthropic announced a partnership with Amazon on September 25. As Cointelegraph reported, Amazon will invest up to $4 billion in a deal that includes cloud services and hardware access. In return, Anthropic says it will provide enhanced support for Amazon's Bedrock foundation AI models, along with "secure model customization and fine-tuning for businesses."

Related: Coinbase CEO warns against AI regulation, calls for decentralization