# DuckDB-NSQL-7B-v0.1-mlx
This is an MLX-optimized version of motherduckdb/DuckDB-NSQL-7B-v0.1, converted for efficient inference on Apple Silicon (M1/M2/M3/M4).
## Model Description
DuckDB-NSQL-7B is a 7-billion parameter language model fine-tuned for generating DuckDB SQL queries from natural language questions. This MLX conversion provides significant performance improvements on Apple Silicon Macs.
## Conversion Details
- Base Model: motherduckdb/DuckDB-NSQL-7B-v0.1
- Precision: Float16 (FP16)
- Framework: MLX
- Optimized for: Apple Silicon (M1/M2/M3/M4)
- Model Size: ~13.5 GB
- Converted by: aikhan1
## Installation

```bash
pip install mlx-lm
```
## Usage

### Basic Inference
```python
from mlx_lm import load, generate

# Load the model
model, tokenizer = load("aikhan1/DuckDB-NSQL-7B-v0.1-mlx")

# Example schema
schema = """
CREATE TABLE hospitals (
    hospital_id BIGINT PRIMARY KEY,
    hospital_name VARCHAR,
    region VARCHAR,
    bed_capacity INTEGER
);

CREATE TABLE patients (
    patient_id BIGINT PRIMARY KEY,
    full_name VARCHAR,
    gender VARCHAR,
    date_of_birth DATE,
    region VARCHAR
);
"""

# Example question
question = "How many patients are there in each region?"

# Build prompt
prompt = f"""You are an assistant that writes valid DuckDB SQL queries.

### Schema:
{schema}

### Question:
{question}

### Response (DuckDB SQL only):"""

# Generate SQL
response = generate(model, tokenizer, prompt=prompt, max_tokens=200, temp=0.0)
print(response)
```
### Using MLX Server

Start the server:

```bash
mlx_lm.server --model aikhan1/DuckDB-NSQL-7B-v0.1-mlx --port 8080
```

In another terminal, make requests:

```bash
curl -X POST http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "CREATE TABLE patients(...)\n\nQuestion: Count patients by region\n\nSQL:",
    "max_tokens": 200,
    "temperature": 0
  }'
```
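The same request can be issued from Python. The sketch below builds the JSON body from the curl example with only the standard library; the `build_completion_request` helper is hypothetical (not part of mlx-lm) and the resulting string can be POSTed with any HTTP client.

```python
import json

# Hypothetical helper mirroring the curl example: build the JSON body
# for the /v1/completions endpoint served by mlx_lm.server.
def build_completion_request(prompt, max_tokens=200, temperature=0.0):
    return json.dumps({
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    })

body = build_completion_request(
    "CREATE TABLE patients(...)\n\nQuestion: Count patients by region\n\nSQL:"
)
print(body)
```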
## Performance Comparison

On Apple Silicon M-series chips, MLX provides:

- Faster inference than PyTorch running on CPU
- Lower memory usage through unified-memory management
- Native GPU acceleration via Metal

Typical generation speed: ~30-50 tokens/second on M1 Pro/Max and ~50-80 tokens/second on M2/M3 series.
## Prompt Format

The model expects prompts in this format:

```
You are an assistant that writes valid DuckDB SQL queries.

### Schema:
CREATE TABLE table_name (
    column1 TYPE,
    column2 TYPE
);

### Question:
[Your natural language question]

### Response (DuckDB SQL only):
```
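If you assemble prompts programmatically, a small helper keeps the format consistent. The `build_prompt` function below is an illustrative sketch of the format above, not part of the model's tooling:

```python
# Hypothetical helper that assembles the documented prompt format
# from a schema string and a natural language question.
def build_prompt(schema: str, question: str) -> str:
    return (
        "You are an assistant that writes valid DuckDB SQL queries.\n\n"
        "### Schema:\n"
        f"{schema}\n\n"
        "### Question:\n"
        f"{question}\n\n"
        "### Response (DuckDB SQL only):"
    )

prompt = build_prompt(
    "CREATE TABLE t (id BIGINT, region VARCHAR);",
    "How many rows are there per region?",
)
print(prompt)
```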
## Limitations
- The model is trained specifically for DuckDB SQL syntax
- Complex queries may require post-processing
- The model may occasionally generate invalid SQL for complex schemas
- Best performance on well-defined schemas with clear column names
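Because raw generations can include markdown fences or trailing commentary, a light post-processing pass often helps. The `extract_sql` helper below is a hedged sketch of such cleanup (keep only the first statement, strip fences), not part of the model or mlx-lm:

```python
import re

FENCE = "`" * 3  # a literal markdown code fence

# Hypothetical post-processing sketch: strip markdown fences and keep
# only the first SQL statement, discarding trailing commentary.
def extract_sql(raw: str) -> str:
    raw = raw.replace(FENCE + "sql", "").replace(FENCE, "")
    match = re.search(r"[^;]*;", raw, flags=re.DOTALL)
    return (match.group(0) if match else raw).strip()

messy = (
    FENCE + "sql\n"
    "SELECT region, COUNT(*) FROM patients GROUP BY region;\n"
    + FENCE + "\nExplanation: this groups patients by region."
)
print(extract_sql(messy))
```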
## License
This model inherits the Llama 2 Community License Agreement from the base model.
## Citation

```bibtex
@misc{duckdb-nsql-mlx,
  title={DuckDB-NSQL-7B MLX Conversion},
  author={aikhan1},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/aikhan1/DuckDB-NSQL-7B-v0.1-mlx}}
}
```

Original model:

```bibtex
@misc{duckdb-nsql,
  title={DuckDB-NSQL-7B: Natural Language to SQL for DuckDB},
  author={MotherDuck},
  year={2024},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/motherduckdb/DuckDB-NSQL-7B-v0.1}}
}
```
## Acknowledgments
- Original model by MotherDuck
- MLX framework by Apple ML Research
- Converted using mlx-lm
- Nuxera AI