Qwen3-0.6B-dieter-sft-GGUF

This is a GGUF conversion of 4rduino/Qwen3-0.6B-dieter-sft, which is a LoRA fine-tuned version of Qwen/Qwen3-0.6B.

Model Details

  • Base Model: Qwen/Qwen3-0.6B
  • Fine-tuned Model: 4rduino/Qwen3-0.6B-dieter-sft
  • Training: Supervised Fine-Tuning (SFT) with TRL
  • Format: GGUF (for llama.cpp, Ollama, LM Studio, etc.)

Available Quantizations

| File | Quant | Size | Description | Use Case |
|---|---|---|---|---|
| Qwen3-0.6B-dieter-sft-f16.gguf | F16 | ~1 GB | 16-bit float | Highest quality, slowest |
| Qwen3-0.6B-dieter-sft-q8_0.gguf | Q8_0 | ~500 MB | 8-bit | High quality |
| Qwen3-0.6B-dieter-sft-q5_k_m.gguf | Q5_K_M | ~350 MB | 5-bit medium | Good quality, smaller |
| Qwen3-0.6B-dieter-sft-q4_k_m.gguf | Q4_K_M | ~300 MB | 4-bit medium | Recommended: good balance of size and quality |

Usage

With llama.cpp

# Download the Q4_K_M file into the current directory
huggingface-cli download 4rduino/Qwen3-0.6B-dieter-sft-GGUF Qwen3-0.6B-dieter-sft-q4_k_m.gguf --local-dir .

# Run with llama.cpp
./llama-cli -m Qwen3-0.6B-dieter-sft-q4_k_m.gguf -p "Your prompt here"
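
llama.cpp also ships llama-server, which exposes an OpenAI-compatible HTTP API. The sketch below assumes the quantized file sits in the working directory and that port 8080 is free; adjust paths and ports for your setup.

# Serve the model over an OpenAI-compatible API
./llama-server -m Qwen3-0.6B-dieter-sft-q4_k_m.gguf -c 4096 --port 8080

# Query it from another terminal
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello!"}]}'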

With Ollama

  1. Create a Modelfile pointing at the downloaded file (a fuller sketch follows below):
FROM ./Qwen3-0.6B-dieter-sft-q4_k_m.gguf
  2. Create and run the model:
ollama create my-model -f Modelfile
ollama run my-model
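
A Modelfile can also pin a system prompt and sampling parameters; the values below are illustrative defaults, not settings tuned for this model:

FROM ./Qwen3-0.6B-dieter-sft-q4_k_m.gguf

# Illustrative defaults; adjust to taste
PARAMETER temperature 0.7
PARAMETER num_ctx 4096
SYSTEM "You are a helpful assistant."

Recent Ollama versions can also pull a GGUF repository straight from Hugging Face, skipping the Modelfile entirely:

ollama run hf.co/4rduino/Qwen3-0.6B-dieter-sft-GGUF:Q4_K_M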

With LM Studio

  1. Download the .gguf file
  2. Import into LM Studio
  3. Start chatting!

License

This model inherits its license from the base model, Qwen/Qwen3-0.6B.

Citation

@misc{Qwen3_0.6B_dieter_sft_GGUF,
  author = {4rduino},
  title = {Qwen3-0.6B-dieter-sft-GGUF},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/4rduino/Qwen3-0.6B-dieter-sft-GGUF}
}

Converted to GGUF format using llama.cpp
