Llama-3.2-3B-Executorch-Q8DA4W

This repository contains the llama3_2_3b_q8da4w.pte model, exported for use with ExecuTorch.

Details

  • Model: Llama 3.2 3B Instruct
  • Format: .pte (ExecuTorch)
  • Quantization: Q8DA4W (8-bit dynamic activations, 4-bit weights)
  • Compatibility: ExecuTorch runtime, including react-native-executorch (React Native)

Usage

This model is ready to be used in mobile applications (iOS/Android) via the ExecuTorch runtime or react-native-executorch.

  1. Download tokenizer.model and llama3_2_3b_q8da4w.pte.
  2. Place them in your app's asset folder.
  3. Load them through the ExecuTorch runtime; a minimal react-native-executorch sketch is shown below.
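
A minimal sketch of loading this model in a React Native component, assuming react-native-executorch exposes a useLLM hook with modelSource/tokenizerSource parameters; the exact field names (isReady, response, generate) may differ between library versions, so check the library's documentation. The asset paths are placeholders.

```tsx
import React, { useEffect } from 'react';
import { Text, View } from 'react-native';
// Assumed API: react-native-executorch's useLLM hook. Verify the hook name
// and option/field names against the version of the library you install.
import { useLLM } from 'react-native-executorch';

export default function Chat() {
  const llm = useLLM({
    // Placeholder asset paths; point these at wherever you placed the files
    // downloaded from this repository.
    modelSource: require('./assets/llama3_2_3b_q8da4w.pte'),
    tokenizerSource: require('./assets/tokenizer.model'),
  });

  useEffect(() => {
    // Kick off a single generation once the model has finished loading.
    if (llm.isReady) {
      llm.generate('Explain ExecuTorch in one sentence.');
    }
  }, [llm.isReady]);

  return (
    <View>
      <Text>{llm.isReady ? llm.response : 'Loading model…'}</Text>
    </View>
  );
}
```

Note that bundling .pte and .model files as assets typically requires adding those extensions to the assetExts list in your Metro bundler config; alternatively, many setups download the files at runtime from a URL instead of bundling them.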