Introducing the GPT-4o-Mini Audio Models: Adding More Choice to Audio-Enhanced AI Interaction

Windows Server · February 5

We are thrilled to announce the release of the new GPT-4o-Mini-Realtime-Preview and GPT-4o-Mini-Audio-Preview models, both now available in preview. These new models introduce advanced audio capabilities at just 25% of the cost of GPT-4o audio models. Adding on to the existing GPT-4o audio models, this expansion enhances the potential for AI applications in text and voice-based interactions. Starting today, developers can unlock immersive, voice-driven experiences by harnessing the advanced capabilities of all Azure OpenAI Service advanced audio models, now in public preview.

Key Benefits

Advanced Audio Capabilities: Enjoy high-quality audio interactions at a fraction of the cost of GPT-4o audio models.
Seamless Compatibility: Our new models are compatible with existing Realtime API and Chat Completion API, ensuring smooth integration and consistent functionality across model families.
Innovative Interactions: Experience natural and intuitive interactions with our voice-based capabilities, making your interactions more engaging and effective.

Detailed Features

GPT-4o-Mini-Realtime-Preview:

Real-Time Voice Interaction: Enable real-time, natural voice-based interactions for a more engaging user experience.
When to Use: Ideal for applications requiring immediate, real-time responses, such as customer service chatbots and virtual assistants.

GPT-4o-Mini-Audio Preview:

Advanced Audio Capabilities: Provides high-quality audio interactions at a reduced cost.
When to Use: Perfect for applications requiring asynchronous audio capabilities, such as recording sentiment analysis and text-to-audio content creation.

Real-World Applications

The potential of our new products spans across various industries, transforming how businesses operate and how users interact with technology:

Customer Service: Voice-based chatbots and virtual assistants can now handle customer inquiries more naturally and efficiently, reducing wait times and improving overall satisfaction.
Content Creation: Media producers can revolutionize their workflows by leveraging speech generation for use in video games, podcasts, and film studios.
Real-Time Translation: Industries such as healthcare and legal services can benefit from real-time audio translation, breaking down language barriers and fostering better communication in critical contexts.

Ready to get started?

Learn more about Azure OpenAI Service
Try it out with Azure AI Foundry

View the full article

Sign In

Introducing the GPT-4o-Mini Audio Models: Adding More Choice to Audio-Enhanced AI Interaction

Recommended Posts

Windows Server

Key Benefits

Detailed Features

Real-World Applications

Join the conversation

Browse

Activity