Whisper.cpp Core ML Models for Apple Silicon

This repository provides prebuilt Core ML models for Whisper.cpp, optimized for Apple Silicon (M1, M2, M3, M4, M5) devices.
These models enable hardware-accelerated speech-to-text using Apple's Neural Engine via Core ML.

The repository is designed for plug-and-play use with Whisper.cpp via a prebuilt CLI binary.


📦 Repository Contents

Each model directory contains everything required to run Whisper.cpp with Core ML acceleration:

  • ggml-*.bin – Whisper model weights used by whisper.cpp
  • *-encoder.mlmodelc/ – Compiled Core ML encoder bundle (Core ML accelerates the encoder; the decoder runs on whisper.cpp's standard ggml backend)

⚠️ Important: The .mlmodelc directories must remain intact. Do not modify, rename, or move their contents.


πŸ“ Folder Structure

.
├── tiny/
├── tiny.en/
├── base/
├── base.en/
├── small/
├── small.en/
├── medium/
├── medium.en/
├── large-v1/
├── large-v2/
└── large-v3/

Each folder corresponds to a Whisper model variant and contains the matching .bin and .mlmodelc files.


🚀 Model Variants Overview

The table below summarizes the trade-offs between speed, accuracy, and memory usage.

| Folder    | Model Size | Speed      | Accuracy       | Notes                                |
|-----------|------------|------------|----------------|--------------------------------------|
| tiny      | Very Small | ⚡ Fastest | ⭐ Lowest       | Best for real-time, low-resource use |
| tiny.en   | Very Small | ⚡ Fastest | ⭐ Lowest       | English-only                         |
| base      | Small      | ⚡ Fast    | ⭐⭐ Balanced    | Good default choice                  |
| base.en   | Small      | ⚡ Fast    | ⭐⭐ Balanced    | English-only                         |
| small     | Medium     | ⚡ Medium  | ⭐⭐⭐ Better    | Improved transcription quality       |
| small.en  | Medium     | ⚡ Medium  | ⭐⭐⭐ Better    | English-only                         |
| medium    | Large      | 🐢 Slower  | ⭐⭐⭐⭐ High     | High accuracy, higher memory usage   |
| medium.en | Large      | 🐢 Slower  | ⭐⭐⭐⭐ High     | English-only                         |
| large-v1  | Very Large | 🐢 Slow    | ⭐⭐⭐⭐⭐ Best   | Maximum accuracy                     |
| large-v2  | Very Large | 🐢 Slow    | ⭐⭐⭐⭐⭐ Best   | Improved multilingual performance    |
| large-v3  | Very Large | 🐢 Slow    | ⭐⭐⭐⭐⭐ Best   | Latest and most accurate             |

🛠 Usage Instructions

1. Download the Prebuilt whisper-cli Binary

Download the Core ML–enabled whisper-cli binary directly from GitHub Releases:

https://github.com/aarush67/whisper-cli-for-core-ml/releases/download/v1.0.0/whisper-cli
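
For example, a minimal download sketch using curl (any download method works; the release URL is the one above):

# Fetch the release binary into a local bin/ directory
mkdir -p bin
curl -L -o bin/whisper-cli \
  https://github.com/aarush67/whisper-cli-for-core-ml/releases/download/v1.0.0/whisper-cli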

Recommended directory structure:

.
├── bin/
│   └── whisper-cli
└── medium.en/
    ├── ggml-medium.en.bin
    └── medium.en-encoder.mlmodelc/

2. Make the Binary Executable

chmod +x bin/whisper-cli
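
If macOS Gatekeeper quarantines the downloaded binary and refuses to run it, clearing the quarantine attribute usually resolves this (a standard macOS step for binaries downloaded outside the App Store, not specific to this project):

# Remove the quarantine attribute that macOS adds to downloaded files
xattr -d com.apple.quarantine bin/whisper-cli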

3. Run Whisper with a Core ML Model

Place the .bin file and the matching .mlmodelc folder in the same directory.

Example (English-only transcription)

./bin/whisper-cli -m medium.en/ggml-medium.en.bin -f sample.wav

Whisper.cpp will automatically detect and use the Core ML encoder when available.
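
Example (multilingual transcription, assumed flags)

The -l (language) and -t (threads) flags below are standard upstream whisper.cpp options; they are assumed, not verified, to be supported by this particular prebuilt binary:

# Auto-detect the spoken language and use 8 CPU threads
./bin/whisper-cli -m base/ggml-base.bin -f sample.wav -l auto -t 8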


🧠 Best Practices & Notes

  • ✅ Keep .bin and .mlmodelc files together from the same model variant (a quick check script follows this list)
  • ❌ Do not rename, edit, or partially copy .mlmodelc directories
  • 🧹 .pt cache files (if generated) are temporary and safe to delete
  • 💾 Larger models require significantly more RAM and disk space
  • ⚡ Best performance is achieved on Apple Silicon devices with a Neural Engine
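
A minimal sanity-check sketch for the first two points (the medium.en paths are an assumption; substitute your variant):

MODEL_DIR="medium.en"                         # assumption: variant folder laid out as above
BIN="$MODEL_DIR/ggml-$MODEL_DIR.bin"          # e.g. medium.en/ggml-medium.en.bin
ENC="$MODEL_DIR/$MODEL_DIR-encoder.mlmodelc"  # e.g. medium.en/medium.en-encoder.mlmodelc
[ -f "$BIN" ] || { echo "missing weights: $BIN"; exit 1; }
[ -d "$ENC" ] || { echo "missing Core ML encoder: $ENC"; exit 1; }
echo "OK: $BIN and $ENC are in place"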

📜 License

  • Repository structure & metadata: MIT License
  • Model weights & Core ML artifacts: Governed by the original
    Whisper.cpp / OpenAI Whisper licenses

Please review upstream licenses before commercial or large-scale use.


πŸ™ Credits

  • OpenAI Whisper – Original speech recognition models
  • whisper.cpp by Georgi Gerganov – High-performance C/C++ implementation
  • Apple Core ML – Hardware acceleration on Apple Silicon

⭐ Support

If this repository is useful to you:

  • ⭐ Star the repository
  • 🔗 Support the upstream Whisper.cpp project