Parakeet TDT v3 - CoreML INT8
CoreML conversion of NVIDIA Parakeet-TDT 0.6B v2 with an INT8-quantized encoder for Apple Neural Engine acceleration.
Models
| Model | Description | Compute | Quantization |
|---|---|---|---|
| `encoder.mlmodelc` | FastConformer encoder (24L, 1024 hidden) | CPU + Neural Engine | INT8 palettized |
| `decoder.mlmodelc` | LSTM prediction network (2L, 640 hidden) | CPU + Neural Engine | FP16 |
| `joint.mlmodelc` | TDT dual-head joint (token + duration logits) | CPU + Neural Engine | FP16 |
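
To illustrate how the two heads of the joint network are consumed together, here is a minimal sketch of one greedy TDT decoding decision. It is not the speech-swift implementation: the `durations` table, the `TDTStep` type, the `greedyStep` function, and the `blankId` parameter are all hypothetical names chosen for this example, and the real duration values should be read from the model configuration.

```swift
// Hypothetical duration table (assumption); real values come from the model config.
let durations = [0, 1, 2, 3, 4]

/// Index of the largest element (the first one wins on ties).
func argmax(_ values: [Float]) -> Int {
    var best = 0
    for i in values.indices where values[i] > values[best] {
        best = i
    }
    return best
}

/// Result of one decoding decision (hypothetical helper type).
struct TDTStep {
    let token: Int?   // nil when the blank token was predicted
    let advance: Int  // number of encoder frames to skip
}

/// One greedy decision from the token head and the duration head.
func greedyStep(tokenLogits: [Float], durationLogits: [Float], blankId: Int) -> TDTStep {
    let token = argmax(tokenLogits)
    var advance = durations[argmax(durationLogits)]
    // A blank with zero duration would never move forward, so force progress.
    if token == blankId && advance == 0 { advance = 1 }
    return TDTStep(token: token == blankId ? nil : token, advance: advance)
}
```

The caller would emit `token` when it is non-nil, feed it back into the prediction network, and move the encoder frame index forward by `advance`.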
Additional Files
| File | Description |
|---|---|
| `vocab.json` | SentencePiece vocabulary (1024 tokens) |
| `config.json` | Model configuration |
Notes
- INT8 vs INT4: INT8 uses 8-bit palettization for the encoder, offering higher accuracy than INT4 at the cost of ~2x encoder weight size.
- Mel preprocessing is done in Swift using Accelerate/vDSP (not CoreML) because `torch.stft` tracing bakes audio length as a constant, breaking per-feature normalization for variable-length inputs.
- Encoder uses `EnumeratedShapes` (100–3000 mel frames, covering 1–30 s audio) to avoid BNNS crashes with dynamic shapes.
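
The last two notes can be sketched in plain Swift. This is an illustration, not the module's code: it uses simple loops where the real preprocessing uses Accelerate/vDSP, and `normalizePerFeature`, `pickShape`, the epsilon value, the N-1 variance denominator, and the list of enumerated frame counts are all assumptions made for the example. The point is that the statistics are computed over the valid frames only, so the result does not depend on a length fixed at trace time.

```swift
/// Normalizes each mel bin to zero mean and unit variance over the valid frames only.
/// `mel` is laid out as [bin][frame]; frames at index >= validFrames are padding.
func normalizePerFeature(_ mel: [[Float]], validFrames: Int, epsilon: Float = 1e-5) -> [[Float]] {
    return mel.map { (row: [Float]) -> [Float] in
        let n = min(validFrames, row.count)
        guard n > 1 else { return row }  // not enough frames to estimate variance
        let valid = row[0..<n]
        let mean = valid.reduce(Float(0), +) / Float(n)
        let sumSq = valid.reduce(Float(0)) { $0 + ($1 - mean) * ($1 - mean) }
        let std = (sumSq / Float(n - 1)).squareRoot() + epsilon
        var out = [Float](repeating: 0, count: row.count)  // padding stays zero
        for t in 0..<n {
            out[t] = (row[t] - mean) / std
        }
        return out
    }
}

/// Smallest enumerated frame count that fits `frames`, or nil if the input is too long.
func pickShape(for frames: Int, from shapes: [Int]) -> Int? {
    return shapes.sorted().first { $0 >= frames }
}

// Hypothetical shape list; the source only states the 100–3000 frame range.
let enumeratedShapes = Array(stride(from: 100, through: 3000, by: 100))
```

A caller would compute the mel spectrogram, pick the target length with `pickShape(for:from:)`, zero-pad to that length, and pass the true frame count as `validFrames`.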
Usage
Used by the ParakeetASR module in speech-swift:

    let model = try await ParakeetASRModel.fromPretrained(modelId: ParakeetASRModel.int8ModelId)
    let text = try model.transcribeAudio(samples, sampleRate: 16000)

- Guide: soniqo.audio/guides/parakeet
- Docs: soniqo.audio
- GitHub: soniqo/speech-swift
- Base model: nvidia/parakeet-tdt-0.6b-v2