Adding `safetensors` variant of this model

#27 opened over 2 years ago by SFconvertbot

Do we have a plan to post the evaluation results to `open_llm_leaderboard`?

#26 opened over 2 years ago by mpsk

Context length schedule and performance

#25 opened over 2 years ago by baffo32

Adding `safetensors` variant of this model

#24 opened over 2 years ago by SFconvertbot

HF version

#23 opened almost 3 years ago by edmond

Pretraining hyperparameters?

#21 opened almost 3 years ago by PY007

How to run on Colab's CPU?

#20 opened almost 3 years ago by deepakkaura26

QLoRA fine-tuning

#19 opened almost 3 years ago by TinyPixel

Why is `get_mup_param_groups` needed instead of the default one in Hugging Face?

#18 opened almost 3 years ago by sanqiang

No CUDA information / nvidia-smi / nvtop

#17 opened almost 3 years ago by nudelbrot

How to reproduce quantized memory usage?

#16 opened almost 3 years ago by tarasglek

What is the inference time? On my Apple M1 Max completions take > 6 min

#15 opened almost 3 years ago by vedtam

Fine-tuning on coding tasks

#14 opened almost 3 years ago by sgaseretto

Your 3B model is very exciting and proves that data improvement works!

#13 opened almost 3 years ago by win10

Any plans on releasing GPTQ or GGML versions of this?

👍 6

#12 opened almost 3 years ago by FriendlyVisage

Why can't we make this fully HF-ready?

#11 opened almost 3 years ago by CUIGuy

LoraConfig's `target_modules` with peft?

#10 opened almost 3 years ago by Handgun1773

Include fastchat-t5 in the benchmark, which is also a 3B-parameter model

👍 3

#9 opened almost 3 years ago by vasilee

Recommendations for additional pretraining?

#8 opened almost 3 years ago by ZQ-Dev