Llama-3.2-3B-Executorch-Q8DA4W
This repository contains the llama3_2_3b_q8da4w.pte model, exported for use with ExecuTorch.
Details
- Model: Llama 3.2 3B Instruct
- Format: .pte (ExecuTorch)
- Quantization: Llama 3.2 3B Instruct model exported for ExecuTorch with Q8DA4W (4-bit weights, 8-bit dynamic activations). Compatible with React Native.
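To give a feel for what "4-bit weights, 8-bit dynamic activations" means, here is a minimal pure-Python sketch. It is an illustration of the general idea only, not the code ExecuTorch uses to produce or run this model: the function names, the group size of 32, and the choice of symmetric scaling for both weights and activations are all assumptions made for the example.

```python
def quantize_weights_4bit(weights, group_size=32):
    """Symmetric per-group 4-bit quantization (illustrative assumption).

    Returns integers in [-8, 7] plus one float scale per group.
    """
    quantized, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # Map the largest magnitude in the group onto the integer 7.
        scale = max_abs / 7 if max_abs > 0 else 1.0
        scales.append(scale)
        quantized.extend(max(-8, min(7, round(w / scale))) for w in group)
    return quantized, scales


def dequantize_weights(quantized, scales, group_size=32):
    """Recover approximate float weights from 4-bit integers and group scales."""
    return [q * scales[i // group_size] for i, q in enumerate(quantized)]


def quantize_activations_8bit(activations):
    """'Dynamic' 8-bit quantization: the scale is computed from the values
    seen at inference time instead of being fixed when the model is exported."""
    max_abs = max(abs(a) for a in activations)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [max(-128, min(127, round(a / scale))) for a in activations], scale
```

Weights are quantized once at export time, so their scales ship inside the .pte file; activation scales are recomputed for every input, which is why they are called dynamic.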
Usage
This model is ready to be used in mobile applications (iOS/Android) via the ExecuTorch runtime or react-native-executorch.
- Download tokenizer.model and llama3_2_3b_q8da4w.pte.
- Place them in your app's asset folder.
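The two steps above can be sketched as a small helper script that copies the downloaded files into an asset folder and fails early if either one is missing. The two file names come from this repository; the function name and the directory arguments are hypothetical and should be adapted to your project layout.

```python
from pathlib import Path
import shutil

# File names as published in this repository.
REQUIRED_FILES = ("tokenizer.model", "llama3_2_3b_q8da4w.pte")


def stage_model_assets(download_dir, asset_dir):
    """Copy the model and tokenizer into asset_dir.

    Returns a dict mapping each file name to its size in bytes.
    Raises FileNotFoundError if a required file is not in download_dir.
    """
    download_dir, asset_dir = Path(download_dir), Path(asset_dir)
    missing = [n for n in REQUIRED_FILES if not (download_dir / n).is_file()]
    if missing:
        raise FileNotFoundError(
            f"Missing from {download_dir}: {', '.join(missing)}"
        )
    # Create the asset folder if the app does not have one yet.
    asset_dir.mkdir(parents=True, exist_ok=True)
    sizes = {}
    for name in REQUIRED_FILES:
        target = asset_dir / name
        shutil.copy2(download_dir / name, target)
        sizes[name] = target.stat().st_size
    return sizes
```

For example, calling `stage_model_assets("downloads", "app/assets")` with your own paths copies both files and returns their sizes, which is a quick way to confirm the .pte file was not truncated during download.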