Use MLX Whisper for GPU transcription on Apple Silicon, not openai-whisper
Transcribing a long local audio file on a Mac as fast as possible.
openai-whisper runs on the CPU on Apple Silicon (PyTorch MPS support for Whisper is incomplete), so a 36-minute file takes roughly 30-60 minutes with the medium model. Switching to mlx-whisper (Apple MLX on Metal) with mlx-community/whisper-large-v3-turbo transcribed the same file in a couple of minutes on an M3 Pro, fully local, with no API calls. Install with pip install mlx-whisper, then run: mlx_whisper file.m4a --model mlx-community/whisper-large-v3-turbo --output-format all. ffmpeg is still needed to decode the audio. No Whisper variant does speaker diarization, so the output is one continuous stream without speaker labels.
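The same transcription can probably be driven from Python instead of the CLI. The sketch below assumes the mlx_whisper.transcribe function with its path_or_hf_repo argument, and assumes the result carries Whisper-style segments with start, end, and text keys. The helper names (format_timestamp, segments_to_lines, transcribe_file) are made up for illustration. It prints timestamped lines, which partly makes up for the missing speaker labels.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS (fractions are dropped)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def segments_to_lines(segments):
    """Turn Whisper-style segments into '[start - end] text' lines.

    There are no speaker labels here: Whisper does not do diarization.
    """
    return [
        f"[{format_timestamp(s['start'])} - {format_timestamp(s['end'])}] "
        f"{s['text'].strip()}"
        for s in segments
    ]


def transcribe_file(path, model="mlx-community/whisper-large-v3-turbo"):
    """Transcribe one audio file with mlx-whisper and return timestamped lines."""
    # Imported lazily so the helpers above work without mlx-whisper installed.
    # Assumption: transcribe() accepts path_or_hf_repo and returns a dict
    # with a "segments" list, mirroring openai-whisper's result layout.
    import mlx_whisper

    result = mlx_whisper.transcribe(path, path_or_hf_repo=model)
    return segments_to_lines(result["segments"])
```

Calling transcribe_file("file.m4a") on an Apple Silicon Mac with ffmpeg installed should return one line per segment. The model is downloaded on first use, so the first run is slower.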
Check the chip first: on an Apple Silicon Mac (uname -m prints arm64), reach for mlx-whisper immediately; fall back to openai-whisper or faster-whisper only on an Intel Mac or a machine with CUDA.
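The chip check can be scripted. This is a minimal sketch using the standard-library platform module; pick_backend is a hypothetical helper, and the strings it returns are just labels for the advice above, not anything the libraries define.

```python
import platform


def pick_backend(system: str, machine: str) -> str:
    """Suggest a Whisper implementation for the given OS and CPU architecture."""
    # macOS on arm64 means Apple Silicon, where MLX can use the GPU via Metal.
    if system == "Darwin" and machine == "arm64":
        return "mlx-whisper"
    # Intel Macs and Linux/Windows machines (with or without CUDA) land here.
    return "openai-whisper or faster-whisper"


def pick_backend_for_this_machine() -> str:
    # Caveat: an x86_64 Python running under Rosetta reports "x86_64"
    # even on Apple Silicon, so run this with a native arm64 interpreter.
    return pick_backend(platform.system(), platform.machine())
```

For example, pick_backend("Darwin", "arm64") returns "mlx-whisper", while pick_backend("Darwin", "x86_64") returns the fallback label.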