The Thiomi Dataset: A Large-Scale Multimodal Corpus for Low-Resource African Languages Paper • 2603.29244 • Published Mar 31
Zero-Shot Morphological Discovery in Low-Resource Bantu Languages via Cross-Lingual Transfer and Unsupervised Clustering Paper • 2604.22723 • Published 12 days ago
Neural Recovery of Historical Lexical Structure in Bantu Languages from Modern Data Paper • 2604.22730 • Published 12 days ago
Attention Sinks in Massively Multilingual Neural Machine Translation: Discovery, Analysis, and Mitigation Paper • 2605.01229 • Published 4 days ago
Continued Pretraining for Low-Resource Swahili ASR: Achieving State-of-the-Art Performance with Minimal Labeled Data Paper • 2603.11378 • Published Mar 11