Llama-3.2-1B-4B-Quad-MoE

This is a Mixture of Experts (MoE) model based on unsloth/Llama-3.2-1B-Instruct. It merges four Llama-3.2-1B variants into a single sparse architecture: an instruct model for general use, the base model for creative prose, a coding specialist, and a math specialist.
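The sparse routing this refers to can be illustrated with a minimal top-2 gating sketch. Sizes and weights below are toy stand-ins, purely illustrative; in the real model the router is a learned layer inside each transformer block and the experts are the merged FFNs.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN, N_EXPERTS, TOP_K = 8, 4, 2  # toy sizes; the real model uses hidden 2048, 4 experts

# Hypothetical stand-ins for the router and the four expert FFNs.
router_w = rng.standard_normal((HIDDEN, N_EXPERTS))
expert_w = rng.standard_normal((N_EXPERTS, HIDDEN, HIDDEN))

def moe_layer(x):
    """Route each token to its top-k experts and mix their outputs."""
    logits = x @ router_w                          # (tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -TOP_K:]  # indices of the top-k experts per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = logits[t, top[t]]
        gates = np.exp(sel - sel.max())
        gates /= gates.sum()                       # softmax over the selected experts only
        for g, e in zip(gates, top[t]):
            out[t] += g * (x[t] @ expert_w[e])     # gate-weighted expert output
    return out

tokens = rng.standard_normal((3, HIDDEN))
print(moe_layer(tokens).shape)  # (3, 8)
```

Because only the selected experts run per token, compute scales with the active parameter count rather than the total.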

Model Details

  • Total Parameters: 3.65B
  • Active Parameters per Token: 1.24B
  • Base Model: unsloth/Llama-3.2-1B-Instruct
  • Experts:
    1. unsloth/Llama-3.2-1B-Instruct (General Purpose)
    2. unsloth/Llama-3.2-1B (Creative/Prose)
    3. cutelemonlili/Llama3.2-1B-Instruct_Lean_Code (Coding Specialist)
    4. prithivMLmods/Llama-Express.1-Math (Logic/Mathematics)
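The headline numbers can be sanity-checked with a back-of-envelope count based on Llama-3.2-1B's published configuration (16 layers, hidden size 2048, FFN size 8192, grouped-query attention with a 512-dim KV projection, 128256-token vocabulary, tied embeddings). Only the FFN weights are duplicated per expert; attention and embeddings are shared, and the 1.24B active figure is consistent with a single expert FFN firing per token:

```python
# Back-of-envelope check of the parameter counts, assuming Llama-3.2-1B's
# published config. Norm weights are small and ignored here.
HIDDEN, INTER, LAYERS, VOCAB, KV_DIM = 2048, 8192, 16, 128256, 512

embed = VOCAB * HIDDEN                                       # tied input/output embedding
attn = LAYERS * (2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_DIM)  # q, o + k, v (GQA)
ffn = LAYERS * 3 * HIDDEN * INTER                            # gate, up, down projections

shared = embed + attn        # replicated once, regardless of expert count
total = shared + 4 * ffn     # four expert copies of every FFN
active = shared + 1 * ffn    # one expert's FFN fires per token

print(f"total  ~ {total / 1e9:.2f}B")   # ~3.65B, matching the card
print(f"active ~ {active / 1e9:.2f}B")  # ~1.24B
```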

How to Use

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Fu01978/Llama-3.2-1B-4B-Quad-MoE")
tokenizer = AutoTokenizer.from_pretrained("Fu01978/Llama-3.2-1B-4B-Quad-MoE")
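Since the model derives from Llama-3.2-1B-Instruct, prompts for generation should be built with `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` rather than raw strings. As an offline illustration of the format that template produces (the header tokens below follow the Llama 3 chat format; `build_prompt` is a hypothetical stand-in for demonstration, not a library API):

```python
# Sketch of the Llama 3-style chat layout, runnable without downloading the
# model. With the real checkpoint, use tokenizer.apply_chat_template instead.
def build_prompt(messages):
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
        )
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_prompt([{"role": "user", "content": "Write a haiku about routing."}])
print(prompt.startswith("<|begin_of_text|>"))  # True
```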

The checkpoint is distributed as BF16 safetensors.