TL;DR We are proudly announcing that the SoTA Llama-3_1-Nemotron-Ultra-253B is now available for Apple Silicon 192gb+ machines thanks to mlx>=0.26.1 and newest mlx-lm runtime! Enjoy! LibraxisAI/Llama-3_1-Nemotron-Ultra-253B-v1-MLX-Q5
All the details covered in README.md file (As we hope). You can try to run on LMStudio if you are not comfortable with the native mlx_lm.server runner using our modified .jinja template and custom tokenizer_config.json.