Llama-3.2-1B-4B-Quad-MoE

This is a Mixture of Experts (MoE) model based on unsloth/Llama-3.2-1B-Instruct. It merges four Llama-3.2-1B variants into a single sparse architecture: an instruct model for general use, the base model for creative prose, a coding specialist, and a math specialist.
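The sparse routing this refers to can be illustrated with a minimal top-2 gating sketch. Sizes and weights below are toy stand-ins, purely illustrative; in the real model the router is a learned layer inside each transformer block and the experts are the merged FFNs.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN, N_EXPERTS, TOP_K = 8, 4, 2  # toy sizes; the real model uses hidden 2048, 4 experts

# Hypothetical stand-ins for the router and the four expert FFNs.
router_w = rng.standard_normal((HIDDEN, N_EXPERTS))
expert_w = rng.standard_normal((N_EXPERTS, HIDDEN, HIDDEN))

def moe_layer(x):
    """Route each token to its top-k experts and mix their outputs."""
    logits = x @ router_w                          # (tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -TOP_K:]  # indices of the top-k experts per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = logits[t, top[t]]
        gates = np.exp(sel - sel.max())
        gates /= gates.sum()                       # softmax over the selected experts only
        for g, e in zip(gates, top[t]):
            out[t] += g * (x[t] @ expert_w[e])     # gate-weighted expert output
    return out

tokens = rng.standard_normal((3, HIDDEN))
print(moe_layer(tokens).shape)  # (3, 8)
```

Because only the selected experts run per token, compute scales with the active parameter count rather than the total.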

Model Details

  • Total Parameters: 3.65B
  • Active Parameters per Token: 1.24B
  • Base Model: unsloth/Llama-3.2-1B-Instruct
  • Experts:
    1. unsloth/Llama-3.2-1B-Instruct (General Purpose)
    2. unsloth/Llama-3.2-1B (Creative/Prose)
    3. cutelemonlili/Llama3.2-1B-Instruct_Lean_Code (Coding Specialist)
    4. prithivMLmods/Llama-Express.1-Math (Logic/Mathematics)
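The headline numbers can be sanity-checked with a back-of-envelope count based on Llama-3.2-1B's published configuration (16 layers, hidden size 2048, FFN size 8192, grouped-query attention with a 512-dim KV projection, 128256-token vocabulary, tied embeddings). Only the FFN weights are duplicated per expert; attention and embeddings are shared, and the 1.24B active figure is consistent with a single expert FFN firing per token:

```python
# Back-of-envelope check of the parameter counts, assuming Llama-3.2-1B's
# published config. Norm weights are small and ignored here.
HIDDEN, INTER, LAYERS, VOCAB, KV_DIM = 2048, 8192, 16, 128256, 512

embed = VOCAB * HIDDEN                                       # tied input/output embedding
attn = LAYERS * (2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_DIM)  # q, o + k, v (GQA)
ffn = LAYERS * 3 * HIDDEN * INTER                            # gate, up, down projections

shared = embed + attn        # replicated once, regardless of expert count
total = shared + 4 * ffn     # four expert copies of every FFN
active = shared + 1 * ffn    # one expert's FFN fires per token

print(f"total  ~ {total / 1e9:.2f}B")   # ~3.65B, matching the card
print(f"active ~ {active / 1e9:.2f}B")  # ~1.24B
```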

How to Use

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Fu01978/Llama-3.2-1B-4B-Quad-MoE")
tokenizer = AutoTokenizer.from_pretrained("Fu01978/Llama-3.2-1B-4B-Quad-MoE")
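Since the model derives from Llama-3.2-1B-Instruct, prompts for generation should be built with `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` rather than raw strings. As an offline illustration of the format that template produces (the header tokens below follow the Llama 3 chat format; `build_prompt` is a hypothetical stand-in for demonstration, not a library API):

```python
# Sketch of the Llama 3-style chat layout, runnable without downloading the
# model. With the real checkpoint, use tokenizer.apply_chat_template instead.
def build_prompt(messages):
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
        )
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_prompt([{"role": "user", "content": "Write a haiku about routing."}])
print(prompt.startswith("<|begin_of_text|>"))  # True
```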

The checkpoint is distributed as BF16 safetensors.