100% GPU memory usage
Hello,
First of all, thank you for making this repo and for all of your hard work.
I was using this model with a Seq2SeqTrainer and kept hitting the GPU memory limit (3060, 12 GB), which did not happen with t5 or mt5. I used the same training args in all cases except for batch_size (64/48/32), but for banglat5 I had to set batch_size = 16.
Is there any way to optimize the GPU memory usage?
Thank you,
Possibly a tokenization issue, since banglat5 has exactly the same architecture as t5. In fact, banglat5 should need less memory, because its tokenizer produces fewer tokens than the mt5 tokenizer for the same Bangla text.
Check that you're using the right tokenizer (the one in this repo).
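One rough way to run that check is to compare token counts on a sample of your own data. The helper below is only a sketch: `token_length_stats` is a hypothetical name, and it assumes `tokenizer` behaves like a Transformers tokenizer, i.e. it can be called on a list of strings and returns a mapping with an "input_ids" entry.

```python
def token_length_stats(tokenizer, texts):
    """Return (max, mean) token counts for `texts`, with no padding or truncation.

    `tokenizer` is assumed to be callable on a list of strings and to return
    a mapping with an "input_ids" entry, as Transformers tokenizers do.
    """
    encoded = tokenizer(texts, padding=False, truncation=False)
    lengths = [len(ids) for ids in encoded["input_ids"]]
    return max(lengths), sum(lengths) / len(lengths)
```

Running it on the same Bangla sentences with the tokenizer loaded from "csebuetnlp/banglat5" and with the mt5 tokenizer should show lower counts for banglat5. If the counts are much higher than expected, the tokenizer in your pipeline is probably not the one from this repo.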
It may also be worthwhile to explicitly set max_length, truncation, and padding when calling the tokenizer.
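Something along these lines, as a minimal sketch: `encode_fixed_length` is a hypothetical helper, and the default of 128 is only a placeholder to adjust to your data, not a recommended value.

```python
def encode_fixed_length(tokenizer, texts, max_length=128):
    """Tokenize `texts` so that every sequence has exactly `max_length` tokens.

    `tokenizer` is assumed to be a Transformers tokenizer, for example the one
    loaded with AutoTokenizer.from_pretrained("csebuetnlp/banglat5").
    """
    return tokenizer(
        texts,
        max_length=max_length,   # upper bound on tokens per sequence
        truncation=True,         # cut longer sequences down to max_length
        padding="max_length",    # pad shorter sequences up to max_length
    )
```

With the sequence length pinned like this, memory use per batch no longer depends on the longest example or on the tokenizer's default maximum length, so batch sizes become comparable across t5, mt5, and banglat5.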
Good luck.