How to use kingabzpro/whisper-tiny-urdu with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("automatic-speech-recognition", model="kingabzpro/whisper-tiny-urdu")
# Load model directly
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq
processor = AutoProcessor.from_pretrained("kingabzpro/whisper-tiny-urdu")
model = AutoModelForSpeechSeq2Seq.from_pretrained("kingabzpro/whisper-tiny-urdu")

This model is a fine-tuned version of openai/whisper-tiny on the common_voice_17_0 dataset. It achieves the following results on the evaluation set:
from transformers import pipeline
transcriber = pipeline(
"automatic-speech-recognition",
model="kingabzpro/whisper-tiny-urdu"
)
transcriber.model.generation_config.forced_decoder_ids = None
transcriber.model.generation_config.language = "ur"
transcription = transcriber("audio2.mp3")
print(transcription)
{'text': 'دیکھیے پانی کب تک بہتا اور مچھلی کب تک تیرتی ہے'}
| Dataset | WER (%) | CER (%) | BLEU | ChrF |
|---|---|---|---|---|
| Common Voice 17.0 (Urdu) | 46.908 | 18.543 | 32.631 | 63.988 |
| HowMannyMore/urdu-audiodataset | 51.405 | 21.830 | 31.475 | 64.204 |
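For readers unfamiliar with the WER and CER columns, the following is a minimal sketch of how the two metrics are conventionally defined: the Levenshtein edit distance between reference and hypothesis, divided by the reference length, at word level for WER and character level for CER. The function names and the toy sentences are illustrative assumptions. The card does not state which evaluation library or text normalization produced the numbers above, so this sketch is not expected to reproduce them exactly.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Counts the minimum number of substitutions, insertions, and
    deletions needed to turn `ref` into `hyp`.
    """
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent (whitespace tokenization, no normalization)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent (spaces count as characters)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


# Toy example (hypothetical sentences, not taken from either dataset):
# one substituted word and one deleted word out of six reference words.
reference = "the cat sat on the mat"
hypothesis = "the cat sit on mat"
print(f"WER: {wer(reference, hypothesis):.2f}%")
print(f"CER: {cer(reference, hypothesis):.2f}%")
```

The same functions work on Urdu text, since they only compare whitespace-separated tokens and Unicode characters. In practice, punctuation and diacritic normalization applied before scoring can shift both metrics noticeably.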
The following hyperparameters were used during training:
| Training Loss | Epoch | Step | Validation Loss | WER (%) |
|---|---|---|---|---|
| 0.6808 | 1.6949 | 500 | 0.7403 | 52.6699 |
| 0.3948 | 3.3898 | 1000 | 0.6850 | 47.1247 |
| 0.2873 | 5.0847 | 1500 | 0.6994 | 48.1516 |
| 0.2024 | 6.7797 | 2000 | 0.7169 | 46.7326 |
| 0.183 | 8.4746 | 2500 | 0.7225 | 47.8529 |
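The training log can be read more easily with a little arithmetic. The sketch below copies the five rows of the table into a list and reports which checkpoint has the lowest WER, which has the lowest validation loss, and roughly how many optimizer steps make up one epoch. The `LogRow` type and variable names are illustrative assumptions. The card does not say which checkpoint was ultimately published.

```python
from typing import NamedTuple


class LogRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    wer: float


# Rows copied from the training results table.
log = [
    LogRow(0.6808, 1.6949, 500, 0.7403, 52.6699),
    LogRow(0.3948, 3.3898, 1000, 0.6850, 47.1247),
    LogRow(0.2873, 5.0847, 1500, 0.6994, 48.1516),
    LogRow(0.2024, 6.7797, 2000, 0.7169, 46.7326),
    LogRow(0.183, 8.4746, 2500, 0.7225, 47.8529),
]

# Checkpoint with the lowest word error rate on the evaluation set.
best_wer = min(log, key=lambda row: row.wer)

# Checkpoint with the lowest validation loss (not necessarily the same one).
best_val_loss = min(log, key=lambda row: row.val_loss)

# Steps per epoch, estimated from each row and rounded to an integer.
steps_per_epoch = {round(row.step / row.epoch) for row in log}

print(f"Lowest WER: {best_wer.wer} at step {best_wer.step}")
print(f"Lowest validation loss: {best_val_loss.val_loss} at step {best_val_loss.step}")
print(f"Estimated steps per epoch: {sorted(steps_per_epoch)}")
```

Training loss keeps falling through step 2500 while validation loss bottoms out at step 1000 and WER at step 2000, which is the usual sign that the later epochs are starting to overfit.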
Base model: openai/whisper-tiny