Training Instructions missing 'qwen2' package
#2
by codys12 - opened
I tried training with the provided instructions, but I get this error:
INFO: ########## work in progress ##########
INFO:lightning.pytorch.utilities.rank_zero:########## work in progress ##########
INFO:
############################################################################
#
# Model qwerky7_qwen2 BF16 on 1x8 GPU, bsz 1x8x12=96, fsdp with grad_cp
#
# Data = data/dclm-10B (binidx), ProjDir = out/L28-D3584-qwerky7_qwen2-4
#
# Epoch = 0 to 71 (will continue afterwards), save every 1 epoch
#
# Each "epoch" = 420 global steps, 40320 samples, 20643840 tokens
#
# Model = 28 n_layer, 3584 n_embd, 512 ctx_len
#
# Adam = lr 7e-06 to 7e-06, warmup 50 steps, beta (0.9, 0.95), eps 1e-08
#
# Found torch 2.7.0+cu126, recommend latest torch
# Found deepspeed None, recommend latest deepspeed
# Found lightning 2.5.1.post0, requires 2+
#
############################################################################
INFO:lightning.pytorch.utilities.rank_zero:
############################################################################
#
# Model qwerky7_qwen2 BF16 on 1x8 GPU, bsz 1x8x12=96, fsdp with grad_cp
#
# Data = data/dclm-10B (binidx), ProjDir = out/L28-D3584-qwerky7_qwen2-4
#
# Epoch = 0 to 71 (will continue afterwards), save every 1 epoch
#
# Each "epoch" = 420 global steps, 40320 samples, 20643840 tokens
#
# Model = 28 n_layer, 3584 n_embd, 512 ctx_len
#
# Adam = lr 7e-06 to 7e-06, warmup 50 steps, beta (0.9, 0.95), eps 1e-08
#
# Found torch 2.7.0+cu126, recommend latest torch
# Found deepspeed None, recommend latest deepspeed
# Found lightning 2.5.1.post0, requires 2+
#
############################################################################
INFO: {}
INFO:lightning.pytorch.utilities.rank_zero:{}
INFO: [rank: 0] Seed set to 1337
INFO:lightning.fabric.utilities.seed:[rank: 0] Seed set to 1337
[2025-06-01 14:58:04,031] [INFO] [real_accelerator.py:239:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Warning: The cache directory for DeepSpeed Triton autotune, /home/steinmetzc/.triton/autotune, appears to be on an NFS system. While this is generally acceptable, if you experience slowdowns or hanging when DeepSpeed exits, it is recommended to set the TRITON_CACHE_DIR environment variable to a non-NFS path.
RWKV_MODEL_TYPE qwerky7
Traceback (most recent call last):
from qwen2.configuration_qwen2 import Qwen2Config
ModuleNotFoundError: No module named 'qwen2'
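For anyone hitting the same traceback, a minimal diagnostic sketch follows. It assumes (this is not confirmed anywhere in the thread) that `qwen2` is a local package directory inside the training checkout rather than something installed from pip, so the error would mean its parent directory is not on the import path. The helper names `module_available` and `candidate_parents` are made up for illustration; only `qwen2` and `configuration_qwen2` come from the traceback.

```python
import importlib.util
from pathlib import Path


def module_available(name: str) -> bool:
    """Return True if a top-level module called `name` can be found on the import path."""
    return importlib.util.find_spec(name) is not None


def candidate_parents(root: Path, package: str = "qwen2",
                      module_file: str = "configuration_qwen2.py") -> list[Path]:
    """Find directories under `root` that contain `package/module_file`.

    Assumption: the missing package lives somewhere in the checkout. Putting one
    of the returned directories on PYTHONPATH would let
    `from qwen2.configuration_qwen2 import Qwen2Config` resolve.
    """
    hits = (p for p in root.rglob(module_file) if p.parent.name == package)
    return sorted(hit.parent.parent for hit in hits)
```

Calling `module_available("qwen2")` from the environment used for training shows whether the import can succeed at all. If it returns `False`, `candidate_parents(Path.cwd())` run from the checkout root lists the directories worth adding to `PYTHONPATH`; an empty list would suggest the package is simply not present in the checkout.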
This looks like some other kind of training on a very large amount of data? Not quite sure if this is a conversion or something else. If you're using the radlads training code, please file an issue on that repo rather than on the Hugging Face model page and we will do our best to address it. Thanks!
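For scale, the figures in the log banner can be cross-checked with a few lines of arithmetic. This sketch uses only numbers printed in the log. Reading `1x8x12` as nodes x GPUs per node x per-GPU batch size is an assumption, and the total covers only the listed epochs 0 to 71, since the log says training "will continue afterwards".

```python
# Values copied from the log banner; variable names are illustrative only.
nodes, gpus_per_node, micro_bsz = 1, 8, 12  # assumed meaning of "bsz 1x8x12"
ctx_len = 512                                # "512 ctx_len"
steps_per_epoch = 420                        # "420 global steps"
listed_epochs = 72                           # "Epoch = 0 to 71"

global_bsz = nodes * gpus_per_node * micro_bsz
samples_per_epoch = steps_per_epoch * global_bsz
tokens_per_epoch = samples_per_epoch * ctx_len
tokens_listed = listed_epochs * tokens_per_epoch

print(f"global batch size : {global_bsz}")
print(f"samples per epoch : {samples_per_epoch}")
print(f"tokens per epoch  : {tokens_per_epoch}")
print(f"tokens, 72 epochs : {tokens_listed}")
```

The first three values match the banner (96, 40320 and 20643840), and the listed epochs come to roughly 1.5 billion tokens.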
SmerkyG changed discussion status to closed
Sorry, I just realized we did not have issues enabled on the code repo! I've enabled that there if you'd like to follow up on this.