updated model card
README.md CHANGED
@@ -19,7 +19,9 @@ Munin-7B-open-pt is a base model that can be used as the starting point for fine-
 ## Training details
 Munin-7B-open-pt has been trained using the [maester](https://github.com/rlrs/maester/tree/main/3aca26960eaa1a16250b3feda40303c240ba4ca1) framework developed as part of the [Danish Foundation Models project](https://foundationmodels.dk/). All training was performed on a single 8x Nvidia B200 node (the first of its kind in Denmark).
 
-The training was performed in three stages, with data mix (open-stageK.py) and maester (open-stageK.toml) configuration files available in each subfolder. The
+The training was performed in three stages, with data mix (open-stageK.py) and maester (open-stageK.toml) configuration files available in each subfolder. The datasets can be created using the create_dataset.py script provided in this repository.
+
+The characteristics of the three pre-training stages are detailed in the following table:
 
 | Stage | Batch size | Steps | HF path | Data mix | Comments |
 |-|-|-|-|-|-|