OpenAI’s latest AI model, GPT-4o, incorporates real-time speech and vision reasoning to enhance interactions between users and machines. The model is free to use and offers faster response times, improved intelligence, and expanded applications in fields such as education and customer service.
OpenAI has introduced a new artificial intelligence model, GPT-4o, which features real-time speech and vision reasoning. The announcement was made during a livestream event hosted by Chief Technology Officer Mira Murati. GPT-4o aims to improve how users interact with machines by enabling natural, seamless conversations through text, audio, and visual prompts. In demonstrations, the model solved equations viewed through an iPhone camera, offered advice on breathing techniques, and responded quickly and accurately to verbal and visual cues.
GPT-4o will be available for free and promises faster response times and improved intelligence over earlier versions. OpenAI also revealed that the model can handle tasks such as translating text in images and holding real-time video conversations. These enhancements aim to make the interface more intuitive and to expand the model’s applications in various fields, including education and customer service.
OpenAI competitor Anthropic, meanwhile, has expanded its AI assistant Claude to the European Union. Developed with a focus on trustworthiness and safety, Claude is available on both iOS and the web. The model uses a method called “constitutional AI” to keep its behavior aligned with a prescribed set of values.
The announcement of GPT-4o comes just ahead of Google’s annual developer conference, where similar advancements in AI are expected to be revealed. Google is anticipated to showcase updates to its AI model Gemini, which also features multimodal capabilities.