mrm8488/spanish-gpt2

Spanish GPT-2 trained on large_spanish_corpus

This is a Spanish GPT-2 model trained from scratch with Flax on the large_spanish_corpus (also known as BETO's corpus). It was built as part of the Flax/JAX Community Week, organised by HuggingFace, with TPU usage sponsored by Google.
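As a rough illustration of how the model might be used for text generation, here is a minimal sketch. It assumes the `transformers` library and PyTorch are installed; the helper name `generate`, the sampling settings, and the default token budget are illustrative choices, not recommendations from the authors.

```python
MODEL_ID = "mrm8488/spanish-gpt2"  # model identifier from this card


def generate(prompt, max_new_tokens=40):
    """Return the prompt plus a sampled continuation (hypothetical helper)."""
    # Imported lazily so the module loads even before transformers is needed.
    from transformers import pipeline

    # The pipeline downloads the tokenizer and weights on first use.
    generator = pipeline("text-generation", model=MODEL_ID)
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,
        do_sample=True,  # assumption: sampling instead of greedy decoding
        top_k=50,        # assumption: an arbitrary, commonly used value
    )
    # The pipeline returns a list of dicts with a "generated_text" key.
    return outputs[0]["generated_text"]
```

Calling `generate("Érase una vez")` would return the prompt followed by a sampled Spanish continuation; the output varies from run to run because sampling is enabled.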

Dataset

The dataset is about 20 GB. 95% of the data was used for training and the remaining 5% for validation.
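A 95/5 split like the one described above can be sketched as follows. The card does not say how the split was made, so the shuffle, the fixed seed, and the function name `split_corpus` are assumptions for illustration only.

```python
import random


def split_corpus(documents, train_fraction=0.95, seed=0):
    """Shuffle documents and split them into (train, validation) lists."""
    shuffled = list(documents)             # copy so the input is not mutated
    random.Random(seed).shuffle(shuffled)  # assumption: seeded random shuffle
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


train_docs, val_docs = split_corpus(range(100))
print(len(train_docs), len(val_docs))
```

For a 20 GB corpus, a streaming or library-based split would be more practical than holding every document in a list; the sketch only shows the proportions.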

Metrics (on evaluation dataset)

Loss: 2.413
Perplexity: 11.36
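For readers relating the two figures above: perplexity is conventionally the exponential of the mean per-token cross-entropy loss (in nats). The sketch below applies that convention. The card does not state how its values were computed, and exp(2.413) comes out near 11.17 rather than 11.36, so the two reported numbers may come from different evaluation steps or a different averaging. `perplexity_from_loss` is an illustrative helper, not part of the training code.

```python
import math


def perplexity_from_loss(mean_loss):
    """Convert a mean cross-entropy loss in nats to perplexity."""
    return math.exp(mean_loss)


def loss_from_perplexity(perplexity):
    """Inverse conversion: the mean loss implied by a given perplexity."""
    return math.log(perplexity)


print(round(perplexity_from_loss(2.413), 2))  # perplexity implied by the reported loss
print(round(loss_from_perplexity(11.36), 3))  # loss implied by the reported perplexity
```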

Team members

Manuel Romero (mrm8488)
María Grandury (mariagrandury)
Pablo González de Prado (Pablogps)
Daniel Vera (daveni)
Sri Lakshmi (srisweet)
José Posada (jdposa)
Santiago Hincapie (shpotes)
Jorge (jorgealro)

Useful links

Model size (Safetensors): 0.1B params
Tensor types: F32, U8
