LLaMA-based Math Question Difficulty Classifier

This model classifies the difficulty of math questions on a 1-3 scale:

  • Level 1: Direct application of definitions; minimal reasoning
  • Level 2: One extra reasoning step (e.g., convert a word problem)
  • Level 3: Multi-step reasoning; multiple concepts and harder mathematical skill
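
The classifier head emits three logits, one per level. A minimal sketch of the assumed class-index-to-level mapping (the checkpoint's config.json id2label entry is authoritative; this dictionary is an illustrative assumption):

# Assumed mapping from 0-indexed model classes to 1-indexed difficulty levels.
ID2LEVEL = {0: 1, 1: 2, 2: 3}

LEVEL_DESCRIPTIONS = {
    1: "direct application of definitions; minimal reasoning",
    2: "one extra reasoning step (e.g., convert a word problem)",
    3: "multi-step reasoning; multiple concepts and harder mathematical skill",
}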

Training Details

Fine-tuned from Meta-Llama-3-8B-Instruct using a two-stage approach:

  1. Head-only warmup (frozen backbone)
  2. Full fine-tuning with class-weighted CrossEntropy
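
A minimal sketch of this two-stage setup, assuming a plain PyTorch loop on top of transformers; the class weights, hyperparameters, and training details below are illustrative assumptions, not the actual training configuration:

import torch
from torch import nn
from transformers import LlamaForSequenceClassification

model = LlamaForSequenceClassification.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    num_labels=3,
    torch_dtype=torch.bfloat16,
)

# Stage 1: head-only warmup -- freeze the backbone, train only the classification head.
for p in model.model.parameters():
    p.requires_grad = False
for p in model.score.parameters():
    p.requires_grad = True

# ... train the head for a short warmup, then:

# Stage 2: full fine-tuning with class-weighted CrossEntropy.
for p in model.parameters():
    p.requires_grad = True

# Illustrative inverse-frequency weights; the real values depend on the label distribution.
class_weights = torch.tensor([1.0, 1.2, 1.5])
loss_fn = nn.CrossEntropyLoss(weight=class_weights)

def compute_loss(logits, labels):
    # logits: (batch, 3) in bf16; labels: (batch,) with values 0-2
    return loss_fn(logits.float(), labels)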

The model was trained with a prompt template that frames the task as difficulty classification independent of grade level: the prompt instructs the model to judge only the reasoning a question requires, not whether its concepts are taught in higher grades (see the Usage example below).

Usage

import torch
from transformers import AutoTokenizer, LlamaForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("YOUR_USERNAME/math-difficulty-classifier")
model = LlamaForSequenceClassification.from_pretrained("YOUR_USERNAME/math-difficulty-classifier")

# Prepare input text
text = """Classify the difficulty level (1โ€“3) of the following math question. Remember this classification of difficutly is agnostic of the concepts. Only consider the difficulty of the question itself, not if the concept is something only learned in higher grades. For example, a simple integral is the same difficulty as a simple trig calculation. 
- Level 1: direct application of definitions; minimal reasoning.
- Level 2: one extra reasoning step (e.g., convert a word problem).
- Level 3: multi-step reasoning; multiple concepts and harder mathematical skil.

Question:
What is the slope of the line passing through the points (3, 7) and (5, 11)?

Difficulty (1–3):"""

inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

predicted_class = outputs.logits.argmax(dim=-1).item()
difficulty = predicted_class + 1  # Convert from 0-indexed class to the 1-3 scale
print(f"Predicted difficulty: {difficulty}")

Limitations

This model is specifically designed for math questions and may not generalize well to other domains.
