Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch (arXiv:2311.03099)
This is a merge of pre-trained language models created using mergekit.
This model was merged using the DARE TIES merge method, with zelk12/MT-Gen6fix-gemma-2-9B as the base model.
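For intuition, here is a minimal per-tensor sketch of the DARE TIES idea in PyTorch. The function names are hypothetical and this is a simplification of what mergekit actually does: DARE randomly drops part of each model's task vector and rescales the survivors, then a TIES-style sign election keeps only the updates that agree with the per-parameter majority sign.

import torch

def dare_delta(finetuned: torch.Tensor, base: torch.Tensor, density: float) -> torch.Tensor:
    # DARE: randomly Drop a (1 - density) fraction of the task vector
    # (finetuned - base), then REscale the survivors by 1 / density so
    # the expected update is unchanged.
    delta = finetuned - base
    mask = torch.bernoulli(torch.full_like(delta, density))
    return delta * mask / density

def dare_ties_merge(base, finetuned_models, densities, weights):
    # Toy per-tensor DARE TIES: sparsify each delta with DARE, weight it,
    # elect a majority sign per parameter (TIES), and sum only the
    # contributions that agree with the elected sign.
    deltas = torch.stack([
        w * dare_delta(ft, base, d)
        for ft, d, w in zip(finetuned_models, densities, weights)
    ])
    elected = torch.sign(deltas.sum(dim=0))   # per-parameter majority sign
    agree = torch.sign(deltas) == elected     # mask of agreeing updates
    return base + (deltas * agree).sum(dim=0)

In the configuration below, each model's density controls how much of its task vector survives the DARE dropout, and weight scales its contribution to the merged delta.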
The following models were included in the merge:

- TheDrummer/Tiger-Gemma-9B-v3
- Sorawiz/Gemma-9B-Chat
- IlyaGusev/gemma-2-9b-it-abliterated
- zelk12/MT-Merge6-gemma-2-9B
- zelk12/MT1-Gen7-gemma-2-9B
The following YAML configuration was used to produce this model:
models:
  - model: zelk12/MT-Gen6fix-gemma-2-9B
    # no parameters necessary for base model
  - model: TheDrummer/Tiger-Gemma-9B-v3
    parameters:
      density: 0.8
      weight: 0.5
  - model: Sorawiz/Gemma-9B-Chat
    parameters:
      density: 0.68
      weight: 0.5
  - model: IlyaGusev/gemma-2-9b-it-abliterated
    parameters:
      density: 0.61
      weight: 0.5
  - model: zelk12/MT-Merge6-gemma-2-9B
    parameters:
      density: 0.5
      weight: 0.5
  - model: zelk12/MT1-Gen7-gemma-2-9B
    parameters:
      density: 0.5
      weight: 0.5
merge_method: dare_ties
base_model: zelk12/MT-Gen6fix-gemma-2-9B
parameters:
  normalize: true
dtype: bfloat16
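To reproduce the merge, the configuration above can be saved to a file and passed to mergekit's mergekit-yaml entry point, e.g. mergekit-yaml config.yaml ./merged-model. The merged model can then be loaded like any other Gemma 2 checkpoint with transformers; a minimal sketch follows, where model_id is a placeholder for the actual repo or local path:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder path: substitute the actual merged-model repo id or local directory.
model_id = "path/to/merged-model"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# bfloat16 matches the dtype declared in the merge configuration.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

prompt = "Explain model merging in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))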