# Prism-Coder 7B (v18clean-epoch0) — Function Calling + AAC Multitask
A fine-tune of Qwen2.5-Coder-7B-Instruct for the Synalux platform: function calling for the Prism Memory MCP server, AAC (Augmentative and Alternative Communication) tasks for the PrismAAC consumer app, and multi-language clinical/communication assistance.

Released 2026-05-04, replacing the prior v18aac-MAX production model. Sibling: `prism-coder-14b`, which handles paid-tier medium queries.
## What changed vs. the previous prod (v18aac-MAX)
| Metric (Prism internal eval; 3 runs, std dev 0%) | Previous prod (v18aac-MAX) | v18clean-epoch0 | Δ |
|---|---|---|---|
| BFCL (Prism 64-test) | 47.2% | 88.1% | +40.9pp |
| AAC realigned (48 cases) | 47/48 (97.9%) | 47/48 (97.9%) | held |
| Caregiver targeted (boost_word + reorder) | 19/20 | 20/20 | +1 |
| emergency_qa | 13/13 | 13/13 | held |
| text_correct | 15/15 | 15/15 | held |
| translate | 7/8 | 8/8 | +1 |
| ask_ai | 5/5 | 5/5 | held |
The previous prod had over-fit to AAC, regressing on the BFCL axis. v18clean-epoch0 is a full DoRA SFT from the clean Qwen2.5-Coder-7B-Instruct base with a balanced curriculum: 5,450 BFCL + 1,500 caregiver + 294 text_correct + 250 emergency + 200 format-anchor + 20 ask_ai examples. Single epoch, LR 1e-5 cosine, DoRA r=128 / alpha=256.
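For concreteness, the mixture above totals 7,714 examples. A sketch of the mix (the dict is illustrative; the actual data-prep code is not part of this card):

```python
# Curriculum mixture from this card, as example counts per task.
CURRICULUM = {
    "bfcl": 5_450,
    "caregiver": 1_500,
    "text_correct": 294,
    "emergency": 250,
    "format_anchor": 200,
    "ask_ai": 20,
}
assert sum(CURRICULUM.values()) == 7_714  # single epoch over the full mix
```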
## Official Berkeley BFCL V4
A handler PR is open against the Gorilla repo: ShishirPatil/gorilla#1332. Local self-run via the official toolkit is in progress; partial results so far:
- Non-Live AST: 79.48% (Simple Python 92.25%, Multiple 90%, Parallel 79%, ParallelMultiple 82.5%, Java 49%, JavaScript 58%)
- Live AST: 73.28%
- Relevance Detection: 100%
- Irrelevance Detection: 64.58%
The Overall score plus the Multi-Turn, Memory, and Web Search categories will be appended once the run completes; the public leaderboard listing is pending the PR merge.
## Use cases
### PrismAAC consumer app
On-device function-calling backbone for an AAC app used by nonverbal children, adults with motor impairments, and BCBA caregivers. Drives:
- Caregiver instruction parsing into structured AAC actions (`add_phrase`, `boost_word`, `reorder_phrase`, `note_only`, etc.); see the tool-schema sketch after this list
- Text correction for users with motor-impaired typing
- Emergency QA (911-call assistance for nonverbal users — 13/13 operator-question accuracy)
- Translation across 12 languages
- Open-ended Q&A scoped to AAC vocabulary
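Below is a minimal sketch of how one of these actions could be exposed as a tool through the chat template (recent `transformers` versions accept a `tools=` argument). The `add_phrase` schema and its field names are illustrative assumptions, not the actual PrismAAC definitions:

```python
from transformers import AutoTokenizer

# Illustrative tool schema for one AAC action. Field names are
# assumptions for this sketch, not the real PrismAAC schema.
add_phrase = {
    "type": "function",
    "function": {
        "name": "add_phrase",
        "description": "Add a phrase to an AAC category board.",
        "parameters": {
            "type": "object",
            "properties": {
                "phrase": {"type": "string"},
                "category": {"type": "string"},
            },
            "required": ["phrase", "category"],
        },
    },
}

tok = AutoTokenizer.from_pretrained("dcostenco/prism-coder-7b")
prompt = tok.apply_chat_template(
    [{"role": "user", "content": "Add 'eat apples' to the food category."}],
    tools=[add_phrase],  # Qwen2.5 chat templates support tool specs
    tokenize=False,
    add_generation_prompt=True,
)
```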
### Synalux portal
Tier-aware routing serves standard/advanced/enterprise traffic locally on this 7B, while complex free-tier queries route to cloud Gemini. At 10K-user scale this saves an estimated $190K-210K/year vs. all-cloud routing.
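A minimal sketch of that routing decision, assuming the tier names from this card; the complexity heuristic, endpoint labels, and `route_query` helper are hypothetical:

```python
# Hypothetical tier-aware router. Tier names match the card; the
# complexity heuristic and backend labels are illustrative assumptions.
LOCAL_TIERS = {"standard", "advanced", "enterprise"}

def route_query(tier: str, prompt: str) -> str:
    """Return the backend that should serve this query."""
    # Paid tiers always stay on the local 7B.
    if tier in LOCAL_TIERS:
        return "local:prism-coder-7b"
    # Free-tier queries that look complex go to the cloud model.
    looks_complex = len(prompt) > 2_000 or prompt.count("\n") > 40
    if looks_complex:
        return "cloud:gemini"
    return "local:prism-coder-7b"  # simple free-tier stays local
```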
## Format
This repo provides:
- `model.safetensors` — single-file BF16 (15.2 GB), loadable via `transformers`, vLLM, sglang
- `config.json`, `generation_config.json`, `tokenizer.json`, `chat_template.jinja` — standard HF assets
- `merged/` — earlier sharded snapshot, kept for reproducibility (will be removed when this README is finalized)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tok = AutoTokenizer.from_pretrained("dcostenco/prism-coder-7b")
m = AutoModelForCausalLM.from_pretrained(
    "dcostenco/prism-coder-7b",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Build a chat prompt using the bundled chat template.
prompt = tok.apply_chat_template(
    [{"role": "user", "content": "Add 'eat apples' to the food category."}],
    tokenize=False,
    add_generation_prompt=True,
)

inputs = tok(prompt, return_tensors="pt").to(m.device)
# do_sample=True is required for temperature to take effect.
out = m.generate(**inputs, max_new_tokens=120, do_sample=True, temperature=0.3)

# Decode only the newly generated tokens, skipping the prompt.
print(tok.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```
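The single-file BF16 checkpoint also loads in vLLM (recent versions); a minimal offline-inference sketch with illustrative sampling values:

```python
from vllm import LLM, SamplingParams

# Load the BF16 checkpoint; .chat() applies the bundled chat template.
llm = LLM(model="dcostenco/prism-coder-7b", dtype="bfloat16")
params = SamplingParams(temperature=0.3, max_tokens=120)  # illustrative values

out = llm.chat(
    [{"role": "user", "content": "Add 'eat apples' to the food category."}],
    sampling_params=params,
)
print(out[0].outputs[0].text)
```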
For Ollama users, a Q4_K_M GGUF is available via the `prism-coder:7b` tag in the Synalux ops fleet (build instructions in the PR).
## Training
- Base: `Qwen/Qwen2.5-Coder-7B-Instruct`
- Method: full DoRA SFT from the clean base (NOT a continuation of any prior fine-tune)
- Adapter: r=128, alpha=256, lora_dropout=0.05, target_modules q/k/v/o/gate/up/down_proj
- Schedule: 1 epoch, LR 1e-5 cosine, warmup 5%, batch 32 effective (per_device 4 × grad_accum 8)
- Curriculum: 5,450 BFCL + 1,500 caregiver + 294 text_correct + 250 emergency + 200 format-anchor + 20 ask_ai
- Compute: H100×2 on Modal, ~3h total
- Ablation: trained 4 epochs with per-epoch saves; epoch_0 (this checkpoint) was Pareto-optimal — later epochs over-fit to the caregiver tasks while losing BFCL accuracy
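For reference, the adapter settings above map onto a Hugging Face `peft` configuration roughly as follows (a sketch; the actual training-stack wiring is not published in this card):

```python
from peft import LoraConfig

# DoRA adapter matching the hyperparameters listed above.
# Sketch only: optimizer, scheduler, and trainer wiring are omitted.
dora_config = LoraConfig(
    r=128,
    lora_alpha=256,
    lora_dropout=0.05,
    use_dora=True,  # DoRA = weight-decomposed LoRA
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```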
## License
Apache 2.0. Free for research and commercial use.
## Citation
```bibtex
@misc{prism-coder-7b-v18clean-2026,
  title  = {Prism-Coder 7B v18clean-epoch0: Balanced Function Calling + AAC Fine-Tune of Qwen2.5-Coder-7B},
  author = {Synalux AI / Dmitri Costenco},
  year   = {2026},
  month  = {May},
  url    = {https://huggingface.co/dcostenco/prism-coder-7b},
  note   = {Companion 14B model: https://huggingface.co/dcostenco/prism-coder-14b. PR: https://github.com/ShishirPatil/gorilla/pull/1332.}
}
```
## Related
- Companion 14B model: `dcostenco/prism-coder-14b`
- Berkeley BFCL V4 PR: ShishirPatil/gorilla#1332
- Synalux portal: synalux.ai
- PrismAAC consumer app: github.com/dcostenco/prism-aac