---
license: other
license_name: taide-l-models-community-license-agreement
license_link: https://drive.google.com/file/d/1ICTxogjS9Bc2O3K1P9ZauQYVoruT13n5/view
extra_gated_heading: 您需要先同意授權條款才能使用此模型 (You must agree to the license terms before using this model)
extra_gated_fields:
  姓名(Name): text
  生日(Date of birth): date_picker
  國家(Country): country
  所屬單位(Affiliation): text
  geo: ip_location
  按下送出表示您同意社群授權同意書與個人資料蒐集告知聲明(By clicking Submit below I accept the terms of the license and privacy policy): checkbox
extra_gated_prompt: >-
  * ### [(Llama 版次)-TAIDE 模型授權條款](https://drive.google.com/file/d/1ICTxogjS9Bc2O3K1P9ZauQYVoruT13n5/view)
  * ### [個人資料蒐集告知聲明(Privacy policy)](https://drive.google.com/file/d/1MfYktH3jBK61YVA1yBLruU7nZlKWFYGd/view)
extra_gated_button_content: 送出(Submit)
---
The license is inherited from the TAIDE Model.
This is an EAGLE3 draft model for Llama-3.1-TAIDE-LX-8B-Chat, trained on a custom sharegpt_gpt4 dataset and intended for inference with SGLang.
The following benchmark was run with this benchmarking file and these settings:

* A single H100 GPU
* dtype: float16
* attention-backend: flashinfer
* mem-fraction-static: 0.8
* max-total-tokens: 131072
* cuda-graph-max-bs: 32
* speculative decoding related:
  * speculative-algorithm: EAGLE3
  * speculative-num-steps: 3
  * speculative-eagle-topk: 24
  * speculative-num-draft-tokens: 128
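For reference, these settings map onto an SGLang server launch roughly as follows. This is a sketch, not the exact command used for the benchmark: the draft-model placeholder and the base model path are assumptions, not taken from this card.

```shell
# Launch SGLang with EAGLE3 speculative decoding using the settings listed above.
# <path-to-this-eagle3-draft-model> is a placeholder for this repository's weights.
python -m sglang.launch_server \
  --model-path taide/Llama-3.1-TAIDE-LX-8B-Chat \
  --speculative-algorithm EAGLE3 \
  --speculative-draft-model-path <path-to-this-eagle3-draft-model> \
  --speculative-num-steps 3 \
  --speculative-eagle-topk 24 \
  --speculative-num-draft-tokens 128 \
  --dtype float16 \
  --attention-backend flashinfer \
  --mem-fraction-static 0.8 \
  --max-total-tokens 131072 \
  --cuda-graph-max-bs 32
```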
* num_prompts: 1

Lhs: Baseline
Rhs: Eagle3

This achieves roughly a 1.56x speedup in inference.
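Once the server is up, the model can be queried through SGLang's OpenAI-compatible chat endpoint. A minimal sketch, assuming the server is running locally on SGLang's default port 30000; the served model id and the prompt are illustrative assumptions:

```python
# Minimal sketch of querying the served model via SGLang's
# OpenAI-compatible /v1/chat/completions endpoint (stdlib only).
import json
import urllib.request

payload = {
    "model": "taide/Llama-3.1-TAIDE-LX-8B-Chat",  # assumed served model id
    "messages": [{"role": "user", "content": "請簡單介紹台灣"}],
    "max_tokens": 256,
}

def chat(url: str = "http://localhost:30000/v1/chat/completions") -> dict:
    """POST the chat payload and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Speculative decoding is transparent to the client: the request is identical with or without EAGLE3 enabled, only the generation latency changes.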
