How to use patrickvonplaten/wav2vec2-base-100h-2nd-try with Transformers:

    # Use a pipeline as a high-level helper
    from transformers import pipeline
    pipe = pipeline("automatic-speech-recognition", model="patrickvonplaten/wav2vec2-base-100h-2nd-try")

    # Load model directly
    from transformers import AutoProcessor, AutoModelForCTC
    processor = AutoProcessor.from_pretrained("patrickvonplaten/wav2vec2-base-100h-2nd-try")
    model = AutoModelForCTC.from_pretrained("patrickvonplaten/wav2vec2-base-100h-2nd-try")

Second fine-tuning try of wav2vec2-base. Results are similar to the ones reported in https://huggingface.co/facebook/wav2vec2-base-100h.
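
The checkpoint is loaded with `AutoModelForCTC`, so its per-frame predictions have to be collapsed into text. The following is a minimal sketch of greedy CTC decoding in plain Python, not the library's actual implementation. The token names `<pad>` (blank) and `|` (word delimiter) and the toy frame sequence are assumptions made for illustration; the real vocabulary comes from the processor.

```python
from itertools import groupby

BLANK = "<pad>"    # assumed CTC blank token (hypothetical for this sketch)
WORD_DELIM = "|"   # assumed word-delimiter token (hypothetical for this sketch)


def ctc_greedy_decode(frame_tokens):
    """Collapse per-frame argmax tokens into a transcript.

    Steps: merge consecutive duplicates, drop blanks,
    then turn the word delimiter into a space.
    """
    collapsed = [token for token, _ in groupby(frame_tokens)]
    chars = [token for token in collapsed if token != BLANK]
    return "".join(chars).replace(WORD_DELIM, " ").strip()


# Toy example: the blank between the two "L" frames keeps the double letter.
frames = ["H", "E", "L", "L", "<pad>", "L", "O", "|", "H", "I", "I"]
print(ctc_greedy_decode(frames))
```

In practice the processor's own decoding method would be used on the model's argmax output; the sketch only shows why repeated frames and blanks do not appear in the final text.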
The model was trained on librispeech-clean-train.100. The hyper-parameters are recorded in the following Weights & Biases run:
https://wandb.ai/patrickvonplaten/huggingface/runs/1yrpescx?workspace=user-patrickvonplaten
Results (WER, in %) on LibriSpeech:
| "clean" (% rel difference to results in paper) | "other" (% rel difference to results in paper) |
|---|---|
| 6.2 (-1.6%) | 15.2 (-11.2%) |
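
For readers unfamiliar with the numbers in the table, here is a rough sketch of how a word error rate and a relative difference can be computed. It is an illustration, not the evaluation script used for this model. The reference values 6.1 and 13.5 are an assumption (they are not stated in this card), and so is the choice of dividing by this model's WER; under those assumptions the formula reproduces the percentages in parentheses.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed prefix of ref and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_difference(reference_wer, model_wer):
    """Relative difference in percent, taken relative to the model's WER (assumed convention)."""
    return (reference_wer - model_wer) / model_wer * 100


# One substitution and one deletion over six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")

# Assumed reference WERs of 6.1 ("clean") and 13.5 ("other").
clean_diff = round(relative_difference(6.1, 6.2), 1)
other_diff = round(relative_difference(13.5, 15.2), 1)
print(wer, clean_diff, other_diff)
```

A corpus-level WER is normally obtained by summing the edit distances over all utterances and dividing by the total number of reference words, rather than by averaging per-utterance scores.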