GenAI G11n Model Assessment

Overview

This repository contains the manual evaluation framework for assessing multilingual and culturally adaptive capabilities in GenAI models. The current implementation evaluates AI systems against success criteria covering instruction following, translation fidelity, linguistic accuracy, and multimodal consistency.

Evaluation Scope

The model is evaluated across the following main categories:

  • Language & Grammar
  • Cultural Adaptation
  • Instruction & Response Coherence
  • Multimodal Consistency

Each category includes granular subcategories. See the latest version of Model Template.xlsx for a detailed breakdown.

Documents

File                Description
Model Template      Lists all evaluation criteria and subcategories
Locale              Contains all documentation related to a specific locale
Prompts Datasets    Lists all localized prompts to be used during assessment
Evaluation Results  Contains the per-model results for each of the evaluated GenAI systems

Models Evaluated

  • ChatGPT (4o)
  • Gemini (2.0 Flash)
  • Copilot (4o)
  • DeepSeek (V3)

Evaluation Process

Each prompt is evaluated by at least 3 reviewers. The reviewers' assessments are then consolidated into a final agreement using a shared sheet. See the test results CSV for the result structure and the eval instructive for the steps.
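
The consolidation step can be illustrated with a minimal sketch. It assumes a simple majority vote over reviewer verdicts, which is one possible rule and not necessarily the one applied in the shared sheet. The verdict labels ("pass", "fail"), the prompt IDs, and the function name `consolidate` are hypothetical and do not come from the test results CSV.

```python
from collections import Counter

MIN_REVIEWERS = 3  # the process requires at least 3 reviewers per prompt


def consolidate(verdicts):
    """Return (majority_verdict, agreement_ratio) for one prompt.

    `verdicts` is a list of reviewer labels such as "pass" or "fail"
    (hypothetical labels). The majority verdict is None on a tie, which
    signals that the reviewers still need to reach agreement.
    """
    if len(verdicts) < MIN_REVIEWERS:
        raise ValueError(
            f"need at least {MIN_REVIEWERS} reviews, got {len(verdicts)}"
        )
    counts = Counter(verdicts).most_common()
    top_label, top_count = counts[0]
    # A tie exists when the second most common label is equally frequent.
    tied = len(counts) > 1 and counts[1][1] == top_count
    return (None if tied else top_label), top_count / len(verdicts)


# Hypothetical reviews keyed by prompt ID.
reviews = {
    "es-MX-001": ["pass", "pass", "fail"],
    "es-MX-002": ["fail", "fail", "fail"],
}
results = {prompt_id: consolidate(v) for prompt_id, v in reviews.items()}
```

Returning the agreement ratio alongside the verdict makes it easy to flag prompts where reviewers diverged, so those can be discussed before the final result is recorded.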

Licensing

This evaluation framework is released under the MIT License. You are free to use, adapt, and extend it with attribution.

Maintainers

  • Andres Castillo – G11n QA
  • Edgar Castillo – G11n QA
  • Patricia Oceguera – Linguistic Advisor
  • Marcela Salgado – Review Support