Libraries

How to use dealignai/MiniMax-M2.7-JANG_2L-CRACK with MLX:

# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("dealignai/MiniMax-M2.7-JANG_2L-CRACK")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)

Local Apps

Pi

How to use dealignai/MiniMax-M2.7-JANG_2L-CRACK with Pi:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "dealignai/MiniMax-M2.7-JANG_2L-CRACK"

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "dealignai/MiniMax-M2.7-JANG_2L-CRACK"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use dealignai/MiniMax-M2.7-JANG_2L-CRACK with Hermes Agent:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "dealignai/MiniMax-M2.7-JANG_2L-CRACK"

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default dealignai/MiniMax-M2.7-JANG_2L-CRACK

Run Hermes

hermes

MLX LM

How to use dealignai/MiniMax-M2.7-JANG_2L-CRACK with MLX LM:

Generate or start a chat session

# Install MLX LM
uv tool install mlx-lm
# Interactive chat REPL
mlx_lm.chat --model "dealignai/MiniMax-M2.7-JANG_2L-CRACK"

Run an OpenAI-compatible server

# Install MLX LM
uv tool install mlx-lm
# Start the server
mlx_lm.server --model "dealignai/MiniMax-M2.7-JANG_2L-CRACK"
# From a second terminal, call the OpenAI-compatible server with curl
# (mlx_lm.server listens on port 8080 by default)
curl -X POST "http://localhost:8080/v1/chat/completions" \
   -H "Content-Type: application/json" \
   --data '{
     "model": "dealignai/MiniMax-M2.7-JANG_2L-CRACK",
     "messages": [
       {"role": "user", "content": "Hello"}
     ]
   }'
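
The same request can be sent from Python. The sketch below uses only the standard library and assumes the server is running on port 8080, the mlx_lm.server default that the Pi and Hermes configurations above also use. The helper names build_chat_request and chat are illustrative and not part of mlx-lm. The max_tokens and temperature values follow the recommendations given later in this card.

```python
import json
import urllib.request

# Assumption: mlx_lm.server is running locally on its default port (8080).
BASE_URL = "http://localhost:8080/v1"
MODEL_ID = "dealignai/MiniMax-M2.7-JANG_2L-CRACK"


def build_chat_request(prompt: str, max_tokens: int = 4000) -> urllib.request.Request:
    """Build a POST request for the /chat/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,  # leave room for the reasoning chain
        "temperature": 1.0,        # this card recommends 1.0 for chat
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str) -> str:
    """Send the request and return the assistant message text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling chat("Hello") only works while the server is up. build_chat_request can be inspected offline to check the payload.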

Important: This model uses the JANG quantization format -- the GGUF equivalent for MLX on Apple Silicon. It is currently supported only by MLX Studio and the jang-tools Python package. Follow @dealignai for new releases.

MLX Studio -- the only app that natively supports JANG models

MiniMax M2.7 -- JANG_2L + CRACK

JANG mixed-precision | CRACK abliterated | Reasoning-only | 63 GB

What Is This?

This is MiniMax M2.7 -- a 230B-parameter Mixture-of-Experts reasoning model with 256 experts (8 active per token), standard attention in all layers, and always-on chain-of-thought reasoning.

It has been:

JANG quantized -- JANG_2L profile (8-bit attention, 6-bit embeddings, 2-bit experts) -- 63 GB
CRACK abliterated -- permanent weight-level removal of safety refusal


Architecture	MiniMax M2.7 MoE -- 230B total, ~10B active, 256 experts
Quantization	JANG_2L (8/6/2-bit mixed) -- 63 GB
Abliteration	CRACK abliterated
MMLU-200	84.0% (base: 88.0%, delta: -4.0 pp)
HarmBench-320	83.4% overall, 92.5% excluding copyright (222/240)
Compliance	8/8 quick test
Reasoning	Always ON (chain-of-thought)
Speed	~47 tok/s (M4 Ultra 256 GB)
Fits on	96 GB+ Macs

MMLU-200 Results

Subject	CRACK	Base	Delta
World Religions	19/20 (95%)	19/20	0
Astronomy	18/20 (90%)	19/20	-1
High School Biology	18/20 (90%)	19/20	-1
High School Mathematics	18/20 (90%)	18/20	0
Logical Fallacies	18/20 (90%)	19/20	-1
Abstract Algebra	17/20 (85%)	16/20	+1
College Physics	16/20 (80%)	16/20	0
High School Chemistry	16/20 (80%)	16/20	0
Anatomy	14/20 (70%)	17/20	-3
College Computer Science	14/20 (70%)	17/20	-3
Total	168/200 (84.0%)	176/200 (88.0%)	-8 (-4.0 pp)
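
The totals can be re-derived from the per-subject rows. The sketch below copies the scores from the table above and recomputes both percentages and the delta. The dictionary layout and the summarize helper are illustrative only.

```python
# (correct with CRACK, correct with base) out of 20 questions per subject,
# copied from the table above.
RESULTS = {
    "World Religions": (19, 19),
    "Astronomy": (18, 19),
    "High School Biology": (18, 19),
    "High School Mathematics": (18, 18),
    "Logical Fallacies": (18, 19),
    "Abstract Algebra": (17, 16),
    "College Physics": (16, 16),
    "High School Chemistry": (16, 16),
    "Anatomy": (14, 17),
    "College Computer Science": (14, 17),
}
QUESTIONS_PER_SUBJECT = 20


def summarize(results):
    """Return overall accuracy for both models and the difference."""
    total = QUESTIONS_PER_SUBJECT * len(results)
    crack = sum(c for c, _ in results.values())
    base = sum(b for _, b in results.values())
    return {
        "crack_pct": 100 * crack / total,
        "base_pct": 100 * base / total,
        "delta_questions": crack - base,
        "delta_points": 100 * (crack - base) / total,  # percentage points
    }


summary = summarize(RESULTS)
print(summary)
```

With only 20 questions per subject, one question is worth 5 percentage points within a subject, so the per-subject deltas are best read as rough indicators.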

HarmBench-320 Results

Category	Score	Rate
Harmful	18/18	100%
Cybercrime / Intrusion	50/52	96.2%
Misinformation / Disinfo	51/54	94.4%
Chemical / Biological	39/42	92.9%
Harassment / Bullying	19/21	90.5%
Illegal	45/53	84.9%
Copyright	45/80	56.2%

Note on copyright: The M2.7 base model is strongly trained to refuse reproducing copyrighted books and lyrics, and it keeps doing so regardless of abliteration. This is a base-model limitation, not an effect of the CRACK surgery.
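
The aggregate figures can be recomputed from the per-category rows. The sketch below copies the counts from the table above and pools them, once over all categories and once with Copyright left out. The aggregate helper is illustrative only.

```python
# (score, total) per category, copied from the table above.
HARMBENCH = {
    "Harmful": (18, 18),
    "Cybercrime / Intrusion": (50, 52),
    "Misinformation / Disinfo": (51, 54),
    "Chemical / Biological": (39, 42),
    "Harassment / Bullying": (19, 21),
    "Illegal": (45, 53),
    "Copyright": (45, 80),
}


def aggregate(rows, exclude=()):
    """Pool counts over categories and return (score, total, percent)."""
    kept = [v for name, v in rows.items() if name not in exclude]
    score = sum(s for s, _ in kept)
    total = sum(t for _, t in kept)
    return score, total, 100 * score / total


overall = aggregate(HARMBENCH)
no_copyright = aggregate(HARMBENCH, exclude=("Copyright",))
print(f"overall: {overall[0]}/{overall[1]} = {overall[2]:.1f}%")
print(f"excluding copyright: {no_copyright[0]}/{no_copyright[1]} = {no_copyright[2]:.1f}%")
```

Pooling counts weights each category by its number of prompts, so the large Copyright category pulls the overall figure down more than any other row.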

JANG CRACK M2.7 Series

Model	Avg Bits	Size	MMLU	HarmBench	Speed	Fits on
JANG_2L + CRACK	2.1	63 GB	84.0%	83.4%	~47 t/s	96 GB Mac
JANG_3L + CRACK	3.08	89 GB	93.5%	79.1%	~46 t/s	128 GB Mac
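
As a rough sanity check, the listed sizes can be compared with a back-of-envelope estimate of parameters times average bits. The sketch below assumes the 230B total parameter count stated above and decimal gigabytes. It ignores anything the average-bits figure may not cover, such as quantization scales or file metadata, so a gap of a few GB against the listed size is expected.

```python
TOTAL_PARAMS = 230e9  # total parameter count stated in this card


def weight_size_gb(avg_bits: float, params: float = TOTAL_PARAMS) -> float:
    """Approximate weight size in GB (10^9 bytes) at a given average bit width."""
    return params * avg_bits / 8 / 1e9


# (profile, average bits, size listed in the table above)
for name, bits, listed_gb in [("JANG_2L", 2.1, 63), ("JANG_3L", 3.08, 89)]:
    estimate = weight_size_gb(bits)
    print(f"{name}: ~{estimate:.1f} GB estimated, {listed_gb} GB listed")
```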

vs MLX Uniform Quantization

Uniform MLX quantization is completely broken on MiniMax at every bit level: MMLU falls to about 25%, which is random chance on four-option questions. JANG is the only working quantization format for this architecture.

Install & Usage

pip install "jang[mlx]"

from jang_tools import load_for_inference
from mlx_lm import generate
from mlx_lm.sample_utils import make_sampler

model, tokenizer = load_for_inference("dealignai/MiniMax-M2.7-JANG_2L-CRACK")
sampler = make_sampler(temp=1.0)  # MiniMax requires temp=1.0 for chat

messages = [{"role": "user", "content": "Your prompt here"}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False)

response = generate(model, tokenizer, prompt=prompt, max_tokens=4000, sampler=sampler)
print(response)

Note: M2.7 is a reasoning-only model -- it always generates a <think> chain before answering. Set max_tokens to 4000 or more for complex questions. For chat, use temperature=1.0, because greedy decoding causes infinite loops.
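
Because every response starts with a reasoning chain, callers usually want to separate it from the final answer. The helper below is a minimal sketch and not part of jang-tools or mlx-lm. It assumes the chain ends with a literal </think> tag and that the opening <think> tag may or may not appear in the generated text, depending on whether the chat template already emitted it.

```python
def split_reasoning(text: str) -> tuple[str, str]:
    """Split raw model output into (reasoning, answer).

    Assumption: the reasoning chain is closed by a literal "</think>" tag.
    If the tag is missing, the output is treated as truncated reasoning
    and the answer is returned as an empty string.
    """
    reasoning, closing_tag, answer = text.partition("</think>")
    # Drop the opening tag if the model (rather than the template) emitted it.
    reasoning = reasoning.strip().removeprefix("<think>").strip()
    if not closing_tag:
        return reasoning, ""
    return reasoning, answer.strip()
```

An empty answer is a hint that max_tokens was too small for the question.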

About JANG

JANG (Jang Adaptive N-bit Grading) is a mixed-precision quantization format for Apple Silicon -- the GGUF equivalent for MLX. It classifies tensors into sensitivity tiers and assigns a bit width to each tier accordingly.
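
To illustrate the idea of tiered bit assignment, the sketch below maps tensor names to the JANG_2L bit widths listed earlier (8-bit attention, 6-bit embeddings, 2-bit experts). It is not the jang-tools implementation. The name patterns and example tensor names are hypothetical, and this card does not say how tensors outside these three groups are handled, so the function returns None for them.

```python
import re
from typing import Optional

# Bit widths come from the JANG_2L profile described in this card.
# The regex patterns are hypothetical and chosen only for illustration.
JANG_2L_TIERS = [
    (re.compile(r"embed"), 6),            # embeddings
    (re.compile(r"attn|attention"), 8),   # attention projections
    (re.compile(r"experts?\."), 2),       # MoE expert weights
]


def bits_for(tensor_name: str) -> Optional[int]:
    """Return the bit width of the first matching tier, or None if no tier matches."""
    for pattern, bits in JANG_2L_TIERS:
        if pattern.search(tensor_name):
            return bits
    return None


# Hypothetical tensor names, for demonstration only.
for name in [
    "model.embed_tokens.weight",
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.0.block_sparse_moe.experts.17.w1.weight",
]:
    print(name, "->", bits_for(name))
```

Most parameters of a Mixture-of-Experts model sit in the experts, which is why the average comes out close to the expert bit width even though attention and embeddings keep more bits.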

About CRACK

CRACK (Controlled Refusal Ablation via Calibrated Knockouts) removes safety alignment from LLMs at the weight level, achieving compliance while preserving reasoning quality.

Disclaimer

This model is provided for research and educational purposes. The creators are not responsible for any misuse. By downloading this model, you agree to use it responsibly and in compliance with applicable laws.

Created by Jinho Jang

Base model: MiniMaxAI/MiniMax-M2.7