nvidia/Nemotron-H-8B-Reasoning-128K


bkartal committed on Jun 6

Commit cf7f23a · verified · 1 parent: 43c0df9

Update README.md

Files changed (1)

README.md +1 -1


@@ -220,7 +220,7 @@ We follow the jinja chat template provided below. This template conditionally ad
 ## Training, Testing, and Evaluation Datasets
-The post-training corpus for Nemotron-H-8B-Reasoning-128K consists of English and multilingual text (German, Spanish, French, Italian, Korean, Portuguese, Russian, Japanese, Chinese and English). Our sources cover a variety of document types such as: webpages, dialogue, articles, and other written materials. The corpus spans domains including code, legal, math, science, finance, and more. We also include a small portion of question-answering, and alignment style data to improve model accuracies. For several of the domains listed above we used synthetic data, specifically reasoning traces, from R1.
+The post-training corpus for Nemotron-H-8B-Reasoning-128K consists of English and multilingual text (German, Spanish, French, Italian, Korean, Portuguese, Russian, Japanese, Chinese and English). Our sources cover a variety of document types such as: webpages, dialogue, articles, and other written materials. The corpus spans domains including code, legal, math, science, finance, and more. We also include a small portion of question-answering, and alignment style data to improve model accuracies. For several of the domains listed above we used synthetic data, specifically reasoning traces, from DeepSeek R1.
 **Data Collection for Training & Testing Datasets:** Hybrid: Automated, Human, Synthetic