OpenAI Charges by the Minute, So Make the Minutes Shorter


A developer found that speeding up audio files to 2x or 3x before sending them to OpenAI's transcription APIs significantly reduces cost and processing time while maintaining transcription quality. Accelerating the audio with ffmpeg shortens the duration billed for whisper-1, which is priced by audio length, and lowers the token count billed for the gpt-4o-transcribe models, which are priced by tokens. In testing, 2x speed saved about 23% on costs, and 3x speed saved even more. The technique works because AI models, like human brains, can still follow speech when it is compressed in time, though at 4x speed the results become unusable.
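The speed-up step described above can be sketched in Python as a thin wrapper around ffmpeg. This is a minimal sketch and not the author's exact command: the helper names and the file names `talk.mp3` and `talk_3x.mp3` are hypothetical, and it assumes ffmpeg's `atempo` audio filter, chaining factors of at most 2.0 so it also runs on older ffmpeg builds that cap a single `atempo` at that value.

```python
import subprocess


def atempo_chain(speed: float) -> str:
    """Build an ffmpeg audio filter string that speeds audio up by `speed`.

    Assumption: each atempo stage is kept at or below 2.0, which is the
    limit on older ffmpeg builds, so larger factors are split into stages.
    """
    if speed < 0.5:
        raise ValueError("atempo does not support factors below 0.5")
    factors = []
    remaining = speed
    while remaining > 2.0:
        factors.append(2.0)
        remaining /= 2.0
    factors.append(remaining)
    return ",".join(f"atempo={f:g}" for f in factors)


def build_ffmpeg_command(src: str, dst: str, speed: float) -> list[str]:
    """Return the ffmpeg argument list; nothing is executed here."""
    return ["ffmpeg", "-y", "-i", src, "-filter:a", atempo_chain(speed), dst]


def sped_up_seconds(duration_seconds: float, speed: float) -> float:
    """Length of the audio after the speed-up, which is what gets uploaded."""
    return duration_seconds / speed


def speed_up(src: str, dst: str, speed: float) -> None:
    """Run ffmpeg (must be installed and on PATH) to write the faster file."""
    subprocess.run(build_ffmpeg_command(src, dst, speed), check=True)


if __name__ == "__main__":
    # Hypothetical file names, printed rather than executed.
    print(" ".join(build_ffmpeg_command("talk.mp3", "talk_3x.mp3", 3.0)))
```

Calling `speed_up("talk.mp3", "talk_3x.mp3", 3.0)` would write a file one third as long, which is then uploaded for transcription in place of the original. How much that saves depends on the model's pricing, so the figures in the summary above should be taken from the author's tests, not from this sketch.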

10m read time. From george.mand.is
Table of contents
I Just Wanted the TL;DW(atch)
My Transcription Workflow
Testing OpenAI’s Transcription Tools
Let's Try Something Obvious
Why This Works: Our Brains Forgive, and So Does AI
Wait—how far can I push this? Does It Actually Save Money?
Does This Save Money?
Is It Accurate?
Why Not 4x?
In Summary
TL;DR
