Prism-Coder 7B (v18clean-epoch0) — Function Calling + AAC Multitask

A fine-tune of Qwen2.5-Coder-7B-Instruct for the Synalux platform: function calling for the Prism Memory MCP server, AAC (Augmentative and Alternative Communication) tasks for the PrismAAC consumer app, and multi-language clinical/communication assistance.

Released 2026-05-04, replacing the prior v18aac-MAX production model. Sibling: prism-coder-14b for paid-tier medium queries.

What changed vs the previous prod (v18aac-MAX)

| Metric (Prism internal eval, 3 runs, StdDev 0%) | Previous prod (v18aac-MAX) | v18clean-epoch0 | Δ |
|---|---|---|---|
| BFCL (Prism 64-test) | 47.2% | 88.1% | +40.9pp |
| AAC realigned (48 cases) | 47/48 (97.9%) | 47/48 (97.9%) | held |
| Caregiver targeted (boost_word + reorder) | 19/20 | 20/20 | +1 |
| emergency_qa | 13/13 | 13/13 | held |
| text_correct | 15/15 | 15/15 | held |
| translate | 7/8 | 8/8 | +1 |
| ask_ai | 5/5 | 5/5 | held |

The previous prod had overfit to AAC at the expense of BFCL. v18clean-epoch0 is a fresh DoRA SFT from the clean Qwen2.5-Coder-7B-Instruct base with a balanced curriculum: 5,450 BFCL + 1,500 caregiver + 294 text_correct + 250 emergency + 200 format-anchor + 20 ask_ai. Single epoch, LR 1e-5 cosine, DoRA r=128 / alpha=256.
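
For scale, a quick tally of that curriculum (counts from this card; the grouping below is just for the arithmetic):

# Curriculum mix from the counts above; totals 7,714 examples.
curriculum = {
    "bfcl": 5450,
    "caregiver": 1500,
    "text_correct": 294,
    "emergency": 250,
    "format_anchor": 200,
    "ask_ai": 20,
}
total = sum(curriculum.values())
for task, n in curriculum.items():
    print(f"{task:>13}: {n:5d}  ({n / total:5.1%})")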

Official Berkeley BFCL V4

A handler PR is open against the Gorilla repo: ShishirPatil/gorilla#1332. A local self-run via the official toolkit is in progress; partial results so far:

  • Non-Live AST: 79.48% (Simple Python 92.25%, Multiple 90%, Parallel 79%, ParallelMultiple 82.5%, Java 49%, JavaScript 58%)
  • Live AST: 73.28%
  • Relevance Detection: 100%
  • Irrelevance Detection: 64.58%

The Overall score plus the Multi-Turn, Memory, and Web Search categories will be appended once the run completes; the public leaderboard listing is pending merge of the PR.

Use cases

PrismAAC consumer app

On-device function-calling backbone for an AAC app used by nonverbal children, adults with motor impairments, and BCBA caregivers. Drives:

  • Caregiver instruction parsing into structured AAC actions (add_phrase, boost_word, reorder_phrase, note_only, etc.; see the tool-call sketch after this list)
  • Text correction for users with motor-impaired typing
  • Emergency QA (911-call assistance for nonverbal users — 13/13 operator-question accuracy)
  • Translation across 12 languages
  • Open-ended Q&A scoped to AAC vocabulary
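
As a rough illustration of the caregiver-parsing path, here is a minimal sketch of how an instruction maps to a structured action. The add_phrase schema and field names below are hypothetical stand-ins, not PrismAAC's actual API; they only show the shape of the contract the model is trained against.

import json

# Hypothetical tool schema (field names are illustrative assumptions).
add_phrase_tool = {
    "name": "add_phrase",
    "description": "Add a phrase to the user's AAC board.",
    "parameters": {
        "type": "object",
        "properties": {
            "phrase": {"type": "string"},
            "category": {"type": "string"},
        },
        "required": ["phrase", "category"],
    },
}

# Caregiver instruction and the structured call the model should emit:
instruction = "Add 'eat apples' to the food category."
expected_call = {"name": "add_phrase", "arguments": {"phrase": "eat apples", "category": "food"}}
print(json.dumps(expected_call, indent=2))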

Synalux portal

Tier-aware routing keeps standard/advanced/enterprise traffic local on this 7B, while complex free-tier queries route to cloud Gemini. At 10K-user scale this saves an estimated $190K-210K/year versus routing everything to the cloud.
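
A minimal sketch of that tier-aware routing decision (tier names are from this card; the complexity heuristic, function name, and backend labels are assumptions):

def route(query: str, tier: str) -> str:
    """Pick a backend for one query. Illustrative only."""
    # Placeholder complexity heuristic; production would use a real classifier.
    is_complex = len(query) > 500 or "step by step" in query.lower()
    if tier == "free" and is_complex:
        return "cloud-gemini"        # offload complex free-tier traffic
    return "prism-coder-7b-local"    # paid tiers (and simple free queries) stay local

print(route("Add 'eat apples' to the food category.", tier="standard"))  # local
print(route("x" * 600, tier="free"))                                     # cloud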

Format

This repo provides:

  • model.safetensors — single-file BF16 (15.2 GB), loadable via transformers, vLLM, sglang
  • config.json, generation_config.json, tokenizer.json, chat_template.jinja — standard HF assets
  • merged/ — earlier sharded snapshot, kept for reproducibility (will be removed when this README is finalized)
Quick start with transformers:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tok = AutoTokenizer.from_pretrained("dcostenco/prism-coder-7b")
m = AutoModelForCausalLM.from_pretrained(
    "dcostenco/prism-coder-7b",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Build a chat-formatted prompt using the bundled chat template.
prompt = tok.apply_chat_template(
    [{"role": "user", "content": "Add 'eat apples' to the food category."}],
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tok(prompt, return_tensors="pt").to(m.device)

# do_sample=True so the temperature setting actually takes effect.
out = m.generate(**inputs, max_new_tokens=120, do_sample=True, temperature=0.3)
print(tok.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
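
The single-file safetensors also loads in vLLM; a minimal sketch (defaults assumed):

from vllm import LLM, SamplingParams

llm = LLM(model="dcostenco/prism-coder-7b", dtype="bfloat16")
params = SamplingParams(temperature=0.3, max_tokens=120)

# chat() applies the repo's chat template automatically.
out = llm.chat(
    [{"role": "user", "content": "Add 'eat apples' to the food category."}],
    params,
)
print(out[0].outputs[0].text)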

For Ollama users, a Q4_K_M GGUF is available via the prism-coder:7b tag in the Synalux ops fleet (build instructions in the PR).
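
If you build that GGUF into a local Ollama install, the official ollama Python client can drive it; a minimal sketch (the tag must match whatever name you give your local build):

import ollama

resp = ollama.chat(
    model="prism-coder:7b",  # local tag; match your own `ollama create` name
    messages=[{"role": "user", "content": "Add 'eat apples' to the food category."}],
)
print(resp["message"]["content"])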

Training

  • Base: Qwen/Qwen2.5-Coder-7B-Instruct
  • Method: DoRA SFT from the clean base (NOT a continuation of any prior fine-tune)
  • Adapter: r=128, alpha=256, lora_dropout=0.05, target_modules q/k/v/o/gate/up/down_proj (see the config sketch after this list)
  • Schedule: 1 epoch, LR 1e-5 cosine, warmup 5%, batch 32 effective (per_device 4 × grad_accum 8)
  • Curriculum: 5,450 BFCL + 1,500 caregiver + 294 text_correct + 250 emergency + 200 format-anchor + 20 ask_ai
  • Compute: H100×2 on Modal, ~3h total
  • Ablation: trained 4 epochs with per-epoch saves; epoch_0 (this checkpoint) was Pareto-optimal — later epochs over-fit caregiver while losing BFCL
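
A minimal sketch of the adapter and schedule in PEFT/TRL terms, reconstructed from the bullets above (trainer wiring and dataset loading are omitted; this is an approximation, not the actual training script):

from peft import LoraConfig
from trl import SFTConfig

# DoRA adapter per the card: r=128, alpha=256, dropout 0.05.
peft_config = LoraConfig(
    r=128,
    lora_alpha=256,
    lora_dropout=0.05,
    use_dora=True,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)

# One epoch, LR 1e-5 cosine, 5% warmup, effective batch 32 (4 x 8), BF16.
train_config = SFTConfig(
    output_dir="prism-coder-7b-v18clean",
    num_train_epochs=1,
    learning_rate=1e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    bf16=True,
)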

License

Apache 2.0. Free for research and commercial use.

Citation

@misc{prism-coder-7b-v18clean-2026,
  title         = {Prism-Coder 7B v18clean-epoch0: Balanced Function Calling + AAC Fine-Tune of Qwen2.5-Coder-7B},
  author        = {Synalux AI / Dmitri Costenco},
  year          = {2026},
  month         = {May},
  url           = {https://huggingface.co/dcostenco/prism-coder-7b},
  note          = {Companion 14B model: https://huggingface.co/dcostenco/prism-coder-14b. PR: https://github.com/ShishirPatil/gorilla/pull/1332.}
}
