What is OpenAI gpt-audio-1.5?

gpt-audio-1.5 is OpenAI's upgraded speech-in / speech-out model, released 2026-04-23 alongside GPT-5.5. It accepts audio input and emits audio output through the Realtime + Audio API, with lower latency and stronger expressive control than the original gpt-audio.

How is gpt-audio-1.5 different from Gemini 3.1 Flash TTS?

Both target Flash-tier-style economics. gpt-audio-1.5 is a unified speech-in / speech-out model strongest on natural conversation and dubbing; Gemini 3.1 Flash TTS focuses on steerable narration with explicit emotion and pacing controls. Conversational / agent / live workloads suit gpt-audio-1.5; long-form narration and explainer content suit Flash TTS.

Does BibiGPT natively integrate gpt-audio-1.5 today?

This page is an event-landing guide. The BibiGPT team is evaluating native integration. In the meantime, export BibiGPT's translated subtitles, AI summary script, or Q&A transcript and call gpt-audio-1.5 directly via the OpenAI Audio API — the workflow already runs end to end.

Why does gpt-audio-1.5 matter for short-form video creators?

Short-form thrives on fast iteration and multi-language delivery. gpt-audio-1.5's lower latency and expressive control let creators redub a single 30-second clip into multiple languages and voice styles in minutes. BibiGPT supplies the translated, chaptered script; gpt-audio-1.5 supplies the voice.

Which BibiGPT pages connect well to this?

BibiGPT's video-to-text, AI subtitle translation, and podcast summarization features generate the script, subtitles, and chapters that gpt-audio-1.5 narrates. Combine them with BibiGPT's auto mind-map and Notion / Obsidian export and you have an end-to-end content production pipeline.

Is this an OpenAI launch announcement?

No. We aggregate what OpenAI published on 2026-04-23 in the OpenAI API model docs / changelog and translate it into practical BibiGPT workflows. For OpenAI's official model details and pricing, follow the link to the OpenAI API model page from the CTA below.

OpenAI gpt-audio-1.5 × BibiGPT

On 2026-04-23 OpenAI shipped gpt-audio-1.5 alongside GPT-5.5 — an upgraded speech-in / speech-out model with lower latency and richer expression. BibiGPT pipes its multilingual subtitles, summaries, and podcast scripts straight into gpt-audio-1.5 to produce ready-to-publish video narration without a recording booth.

Generate narration scripts in BibiGPT

Released · 2026-04-23 Speech-in / speech-out Ships with GPT-5.5

Key facts (90-second read)

OpenAI released gpt-audio-1.5 on 2026-04-23 alongside GPT-5.5 — a unified speech-in / speech-out model with lower latency and richer expressive control than gpt-audio. Pair it with BibiGPT's multilingual subtitles, AI summaries, and chaptered transcripts and you get an end-to-end video narration / dubbing / summary-to-podcast pipeline without booking voiceover talent.

What is gpt-audio-1.5?

gpt-audio-1.5 is OpenAI's upgraded speech-in / speech-out model launched on 2026-04-23 alongside GPT-5.5. Same Realtime + Audio API surface, lower latency and stronger expressive control than gpt-audio.

Speech-in / speech-out in one model

Accept audio input and emit audio output without bouncing through a separate ASR + TTS stack. Cuts round-trip latency for live narration, dubbing, and conversational flows.

Tunable voice and expression

Inherits gpt-audio's expressive style controls and adds finer-grained pacing and emphasis steering — closer to studio narration without re-recording takes.

Released with GPT-5.5

Ships alongside the GPT-5.5 reasoning upgrade on 2026-04-23. Pair gpt-audio-1.5 for narration with GPT-5.5 for the underlying script and you stay inside one OpenAI stack.

Why it matters for BibiGPT users

BibiGPT already turns Bilibili / YouTube / podcasts into multilingual scripts, subtitles, and summaries. gpt-audio-1.5 is the missing last mile for narration, dubbing, and summary-to-podcast workflows.

Subtitle-driven AI narration

Pipe BibiGPT's translated subtitles or AI summary scripts into gpt-audio-1.5 and ship a redubbed video in zh / en / ja / ko without booking a voiceover artist or studio.

Long video to short narrated clip

Use BibiGPT to generate chapter highlights from a 60-minute lecture, then narrate just the highlight chunk through gpt-audio-1.5 — short-form social posts shipped in minutes.

Summary-to-podcast pipeline

Turn a BibiGPT-generated summary or follow-up Q&A into a hosted podcast episode. gpt-audio-1.5 handles the voice; BibiGPT handles the script, chaptering, and translation.

5 key changes (90-second read)

All sourced from the OpenAI API model docs and the 2026-04-23 release alongside GPT-5.5.

1

Released 2026-04-23 with GPT-5.5

gpt-audio-1.5 ships the same day as GPT-5.5 (codename Spud). Audio + Realtime API users picked it up day one; pricing and availability published in the OpenAI API model docs.
2

Speech-in / speech-out unified

One model handles both audio input understanding and audio output generation, removing the ASR + TTS round trip. Simpler stacks for live agents, dubbing, and conversational replies.
3

Lower latency than gpt-audio

Latency improvements compared to the original gpt-audio at the same expressive quality — better for real-time narration loops and live podcast / interview workflows.
4

Stronger expression and steering

Finer-grained pacing, emphasis, and emotion control versus gpt-audio. Same script can land as serious / playful / casual without re-recording takes.
5

Pairs with the GPT-5.5 reasoning upgrade

GPT-5.5 generates the script (Terminal-Bench 2.0 at 82.7%, FrontierMath at 35.4%); gpt-audio-1.5 narrates it. End-to-end OpenAI stack for narrated explainers, agent-driven dubbing, and summary podcasts.

3 typical scenarios for BibiGPT users

Grounded in real BibiGPT user personas; all already actionable today via the OpenAI Audio / Realtime API.

General creators — AI dubbing

Run a YouTube / Bilibili video through BibiGPT for translated subtitles in zh / en / ja / ko, then narrate the translated track via gpt-audio-1.5. One source video, four-language redub, no studio.

BibiGPT users — long video to short narrated clip

Students, teachers, and creators feed lecture or course videos into BibiGPT for chapter segmentation + highlight summaries, then narrate just the highlight chunks through gpt-audio-1.5 for short-form social posts.

Advanced combo — summary to podcast

BibiGPT summarizes a podcast episode or research video into a structured script → GPT-5.5 polishes and adds host / guest segments → gpt-audio-1.5 narrates → ship a recap podcast, entirely inside the OpenAI + BibiGPT stack.

Loved by creators, students & researchers

Why people use BibiGPT to turn videos into text every day.

Trusted by 50,000+ users worldwide

★★★★★

“I paste a link and get clean captions in seconds — it saves me hours of retyping every single week.”

Maya R.

Content Creator · Repurposes short videos

★★★★★

“Exporting the transcript lets me review new words at my own pace instead of pausing the video constantly.”

Daniel K.

Language Learner · Studies with real videos

★★★★★

“Accurate, timestamped text I can quote directly. It has quietly become part of my daily workflow.”

Priya S.

Researcher · Cites public talks

FAQ'S

Frequently Asked Questions

Ask us anything!

Popular guides

Bilibili AI Video Summary Tool: BibiGPT Summarizes 30+ Platforms Instantly (2026)

Best Bilibili AI video summary tool 2026? Paste a link for summary, mind map, and highlights on 30+ platforms — free tier to start.

Bilibili Transcript Tools Compared: Best Subtitle Extractors in 2026

Looking for the best bilibili transcript tool? We compare 5 top subtitle extractors for Bilibili videos — from free downloaders to AI-powered tools like BibiGPT that handle transcription, translation, and summarization.

OpenClaw + BibiGPT Skill 2026: AI Video Summary for Bilibili, Xiaohongshu & 30+ Platforms

OpenClaw can't summarize Bilibili/Douyin alone. Install bibigpt-skill once and summarize 30+ video platforms inside Claude Code — free to try.

Turn any video into narration-ready scripts with BibiGPT

BibiGPT summarizes YouTube, Bilibili, and podcasts into multilingual scripts and subtitles. Plug the output into OpenAI gpt-audio-1.5 (Audio / Realtime API) and you get publish-ready narration. No custom stack, no learning curve.

Try BibiGPT free

OpenAI gpt-audio-1.5 × BibiGPT

Key facts (90-second read)

Features

What is gpt-audio-1.5?

Speech-in / speech-out in one model

Tunable voice and expression

Released with GPT-5.5

Why it matters for BibiGPT users

Subtitle-driven AI narration

Long video to short narrated clip

Summary-to-podcast pipeline

5 key changes (90-second read)

Released 2026-04-23 with GPT-5.5

Speech-in / speech-out unified

Lower latency than gpt-audio

Stronger expression and steering

Pairs with the GPT-5.5 reasoning upgrade

3 typical scenarios for BibiGPT users

General creators — AI dubbing

BibiGPT users — long video to short narrated clip

Advanced combo — summary to podcast

Loved by creators, students & researchers

Frequently Asked Questions

More Free Tools

Gemini Flash TTS × BibiGPT

OpenClaw × BibiGPT Skill

NotebookLM 2026 Update × BibiGPT

Cohere Transcribe 03-2026 × BibiGPT

Popular guides

Bilibili AI Video Summary Tool: BibiGPT Summarizes 30+ Platforms Instantly (2026)

Bilibili Transcript Tools Compared: Best Subtitle Extractors in 2026

OpenClaw + BibiGPT Skill 2026: AI Video Summary for Bilibili, Xiaohongshu & 30+ Platforms

Turn any video into narration-ready scripts with BibiGPT