---
license: apache-2.0
datasets:
- code-search-net/code_search_net
language:
- en
pipeline_tag: fill-mask
tags:
- code
- python
- java
- javascript
- go
- ruby
- php
---
|
|
|
|
|
# CodeModernBERT-Finch |
|
|
|
|
|
This is a code-specific pretrained model, trained solely on the CodeSearchNet dataset. It supports the six languages included in CodeSearchNet (Go, Java, JavaScript, PHP, Python, and Ruby).
|
|
For a version fine-tuned specifically for code search tasks, please refer to [Shuu12121/CodeSearch-ModernBERT-Finch](https://huggingface.co/Shuu12121/CodeSearch-ModernBERT-Finch). |
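
A minimal masked-language-modeling usage sketch is shown below. The repository id `Shuu12121/CodeModernBERT-Finch` is assumed from this card's title and the link above; adjust it if the checkpoint lives under a different name.

```python
from transformers import pipeline

# Assumed repository id (taken from the card title); adjust if needed.
fill_mask = pipeline("fill-mask", model="Shuu12121/CodeModernBERT-Finch")

# Use the tokenizer's own mask token instead of hard-coding one.
mask = fill_mask.tokenizer.mask_token
code = f"def add(a, b):\n    return a {mask} b"

for prediction in fill_mask(code, top_k=3):
    print(prediction["token_str"], round(prediction["score"], 3))
```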
|
|
|
|
|
## Architecture |
|
|
|
|
|
* Base: ModernBERT-style encoder
* Hidden size: 512
* Layers: 6
* Attention heads: 6
* Parameters: ~50M
* Pretraining: Masked Language Modeling (MLM)
* Fine-tuning: Domain-specific code tasks
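
These values can be double-checked by loading the checkpoint and inspecting its configuration and parameter count. The sketch below assumes the repository id `Shuu12121/CodeModernBERT-Finch` and the standard `transformers` auto classes.

```python
from transformers import AutoConfig, AutoModelForMaskedLM

model_id = "Shuu12121/CodeModernBERT-Finch"  # assumed repository id

# Configuration values corresponding to the list above.
config = AutoConfig.from_pretrained(model_id)
print(config.hidden_size, config.num_hidden_layers, config.num_attention_heads)

# Rough parameter count in millions.
model = AutoModelForMaskedLM.from_pretrained(model_id)
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.1f}M parameters")
```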
|
|
|
|
|
## Evaluation

The results below were obtained by randomly sampling 10,000 examples per language from the CodeSearchNet dataset, fine-tuning each model on those samples in a Sentence-BERT fashion, and evaluating on the MTEB CodeSearchNetRetrieval benchmark.
|
|
All models listed in the table below were fine-tuned using the same approach. The models marked with "200" and the standard Finch models were trained with MultipleNegativesRankingLoss and a batch size of 200; the others were trained with a batch size of 40, because larger batches did not fit into memory.
|
|
Finch-SmallBatch was trained with the smaller batch size of 40 to serve as a comparison against the standard Finch model trained with batch size 200.
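
The sketch below illustrates this fine-tuning setup with the `sentence-transformers` library. The (docstring, code) pair construction, the epoch count, and the repository id are assumptions; the card only specifies the sampling size, the loss, and the batch sizes.

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses, models

model_id = "Shuu12121/CodeModernBERT-Finch"  # assumed repository id

# Wrap the pretrained encoder with a mean-pooling head, Sentence-BERT style.
word_embedding = models.Transformer(model_id)
pooling = models.Pooling(word_embedding.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_embedding, pooling])

# Hypothetical (docstring, code) pairs; the card samples 10,000 examples per language.
train_examples = [
    InputExample(texts=["Return the sum of two numbers.", "def add(a, b):\n    return a + b"]),
    InputExample(texts=["Reverse a string.", "def rev(s):\n    return s[::-1]"]),
]

# Batch size 200 for the standard Finch models; 40 for the small-batch variants.
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=200)
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1)
```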
|
|
|
|
|
| Model | Go | Java | JavaScript | PHP | Python | Ruby |
| ---------------------------------- | ----- | ----- | ---------- | ----- | ------ | ----- |
| Finch(40M) | 0.934 | 0.784 | 0.728 | 0.835 | 0.865 | 0.756 |
| Finch-Pre(40M) | 0.937 | 0.705 | 0.685 | 0.828 | 0.843 | 0.725 |
| Finch-SmallBatch(40M) | 0.930 | 0.765 | 0.707 | 0.825 | 0.859 | 0.748 |
| ModernBERT-base-Finetuned(149M) | 0.933 | 0.779 | 0.748 | 0.839 | 0.885 | 0.794 |
| Owl-4.1-Small-Fine-tuned(151M) | 0.942 | 0.780 | 0.729 | 0.843 | 0.893 | 0.772 |
| Owl-4.1-Small-Fine-tuned-200(151M) | 0.943 | 0.850 | 0.747 | 0.858 | 0.894 | 0.802 |
| CodeBERT-Fine-tuned(125M) | 0.932 | 0.708 | 0.709 | 0.828 | 0.870 | 0.772 |
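
For reference, an evaluation run of this kind can be reproduced with the `mteb` package. The sketch below is a minimal example, assuming the fine-tuned checkpoint linked above loads as a SentenceTransformer model.

```python
import mteb
from sentence_transformers import SentenceTransformer

# Fine-tuned code-search checkpoint referenced at the top of this card.
model = SentenceTransformer("Shuu12121/CodeSearch-ModernBERT-Finch")

# CodeSearchNetRetrieval covers the same six languages as the table above.
tasks = mteb.get_tasks(tasks=["CodeSearchNetRetrieval"])
evaluation = mteb.MTEB(tasks=tasks)
results = evaluation.run(model, output_folder="mteb_results")
```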
|
|
|
|
|
|
|
|
--- |
|
|
|
|
|
|