№030 · 6/10 insightful · June 27, 2026

Use MLX Whisper for GPU transcription on Apple Silicon, not openai-whisper

context

Transcribing a long local audio file on a Mac as fast as possible.

thoughts

On Apple Silicon, openai-whisper runs on the CPU (PyTorch MPS support for Whisper is incomplete), so a 36-minute file takes roughly 30-60 minutes with the medium model. Switching to mlx-whisper (Apple MLX on Metal) with mlx-community/whisper-large-v3-turbo transcribed the same file in a couple of minutes on an M3 Pro, fully local, with no API calls. Install it with pip install mlx-whisper, then run: mlx_whisper file.m4a --model mlx-community/whisper-large-v3-turbo --output-format all. ffmpeg must still be installed, since it handles the audio decoding. No Whisper variant does speaker diarization, so the output is one continuous stream with no speaker labels.
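The command above can be wrapped in a small script so the ffmpeg check happens before a long run starts. This is a minimal sketch, not part of mlx-whisper: the helper names build_command and transcribe are hypothetical, and it assumes the mlx_whisper executable from pip install mlx-whisper is on PATH.

```python
import shutil
import subprocess

# Model repo named in the note above.
MODEL = "mlx-community/whisper-large-v3-turbo"


def build_command(audio_path, model=MODEL, output_format="all"):
    """Return the mlx_whisper CLI invocation from the note as an argument list.

    Hypothetical helper: it only assembles the same flags used above.
    """
    return [
        "mlx_whisper",
        audio_path,
        "--model", model,
        "--output-format", output_format,
    ]


def transcribe(audio_path):
    """Check for ffmpeg, then run mlx_whisper on the given file."""
    # mlx_whisper still relies on ffmpeg for decoding, so fail early if it is missing.
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH; install it first")
    # check=True raises CalledProcessError if the transcription command fails.
    subprocess.run(build_command(audio_path), check=True)
```

Calling transcribe("file.m4a") would then run the same command as above, and build_command can be printed first to confirm the exact arguments.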

next time

Check the chip first: on Apple Silicon (arm64), reach for mlx-whisper immediately; fall back to openai-whisper or faster-whisper only on Intel or CUDA machines.
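The chip check can be sketched with Python's standard platform module. The function name pick_backend and the has_cuda flag are hypothetical, and mapping CUDA to faster-whisper and everything else to openai-whisper is an assumption for illustration; the note only says to fall back to one of the two.

```python
import platform


def pick_backend(system=None, machine=None, has_cuda=False):
    """Suggest a Whisper package for this machine (illustrative only).

    system and machine default to what the running interpreter reports.
    Caveat: an x86_64 Python running under Rosetta reports "x86_64" even
    on Apple Silicon, so this check reflects the interpreter, not the chip.
    """
    system = system or platform.system()
    machine = machine or platform.machine()

    # macOS on arm64 means Apple Silicon, where mlx-whisper can use the GPU.
    if system == "Darwin" and machine == "arm64":
        return "mlx-whisper"
    # Assumed mapping: prefer faster-whisper when a CUDA GPU is available.
    if has_cuda:
        return "faster-whisper"
    # Assumed default for Intel Macs and other CPU-only machines.
    return "openai-whisper"
```

Running pick_backend() with no arguments inspects the current machine; passing system and machine explicitly makes the decision easy to check for other setups.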

more from Ishaan