Spaces: VibecoderMcSwaggins/DeepBoner

VibecoderMcSwaggins committed 15 days ago

Commit 4927db5

1 Parent(s): 75bc69f

fix: align all docs/tests to gpt-5.1 (actual current model)

Files changed (5)

AGENTS.md +2 -3
CLAUDE.md +2 -3
GEMINI.md +2 -3
docs/bugs/INVESTIGATION_INVALID_MODELS.md +13 -12
tests/unit/agent_factory/test_judges_factory.py +3 -3

AGENTS.md CHANGED

@@ -93,9 +93,8 @@ DeepBonerError (base)
 Given the rapid advancements, as of November 29, 2025, the DeepBoner project uses the following default LLM models in its configuration (`src/utils/config.py`):
-- **OpenAI:** `gpt-5`
-  - This is the stable flagship model released in August 2025.
-  - While `gpt-5.1` (released November 2025) exists, it is currently gated, and attempts to use it resulted in a `403 model_not_found` error for typical API keys. Advanced users with access to `gpt-5.1-instant`, `gpt-5.1-thinking`, or `gpt-5.1-codex-max` may configure their `.env` accordingly.
+- **OpenAI:** `gpt-5.1`
+  - Current flagship model (November 2025). Requires Tier 5 access.
 - **Anthropic:** `claude-sonnet-4-5-20250929`
   - This is the mid-range Claude 4.5 model, released on September 29, 2025.
   - The flagship `Claude Opus 4.5` (released November 24, 2025) is also available and can be configured by advanced users for enhanced capabilities.

CLAUDE.md CHANGED

@@ -100,9 +100,8 @@ DeepBonerError (base)
 Given the rapid advancements, as of November 29, 2025, the DeepBoner project uses the following default LLM models in its configuration (`src/utils/config.py`):
-- **OpenAI:** `gpt-5`
-  - This is the stable flagship model released in August 2025.
-  - While `gpt-5.1` (released November 2025) exists, it is currently gated, and attempts to use it resulted in a `403 model_not_found` error for typical API keys. Advanced users with access to `gpt-5.1-instant`, `gpt-5.1-thinking`, or `gpt-5.1-codex-max` may configure their `.env` accordingly.
+- **OpenAI:** `gpt-5.1`
+  - Current flagship model (November 2025). Requires Tier 5 access.
 - **Anthropic:** `claude-sonnet-4-5-20250929`
   - This is the mid-range Claude 4.5 model, released on September 29, 2025.
   - The flagship `Claude Opus 4.5` (released November 24, 2025) is also available and can be configured by advanced users for enhanced capabilities.

GEMINI.md CHANGED

@@ -74,9 +74,8 @@ Settings via pydantic-settings from `.env`:
 Given the rapid advancements, as of November 29, 2025, the DeepBoner project uses the following default LLM models in its configuration (`src/utils/config.py`):
-- **OpenAI:** `gpt-5`
-  - This is the stable flagship model released in August 2025.
-  - While `gpt-5.1` (released November 2025) exists, it is currently gated, and attempts to use it resulted in a `403 model_not_found` error for typical API keys. Advanced users with access to `gpt-5.1-instant`, `gpt-5.1-thinking`, or `gpt-5.1-codex-max` may configure their `.env` accordingly.
+- **OpenAI:** `gpt-5.1`
+  - Current flagship model (November 2025). Requires Tier 5 access.
 - **Anthropic:** `claude-sonnet-4-5-20250929`
   - This is the mid-range Claude 4.5 model, released on September 29, 2025.
   - The flagship `Claude Opus 4.5` (released November 24, 2025) is also available and can be configured by advanced users for enhanced capabilities.
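
Reviewer sketch (not part of the commit): the three docs above say the defaults live in `src/utils/config.py` and are read from `.env`. The snippet below illustrates how such defaults could be overridden per environment. It deliberately swaps pydantic-settings for the standard library so it runs without dependencies; the `Settings` class shape and the environment variable names `OPENAI_MODEL` and `ANTHROPIC_MODEL` are assumptions, not the project's actual code.

```python
import os
from dataclasses import dataclass, field

# Default model IDs as documented in this commit.
DEFAULT_OPENAI_MODEL = "gpt-5.1"
DEFAULT_ANTHROPIC_MODEL = "claude-sonnet-4-5-20250929"


def _env(name: str, default: str) -> str:
    """Return environment variable `name`, or `default` if unset or empty."""
    return os.environ.get(name) or default


@dataclass(frozen=True)
class Settings:
    # Hypothetical stand-in for the settings class in src/utils/config.py.
    # The environment variable names below are assumptions.
    openai_model: str = field(
        default_factory=lambda: _env("OPENAI_MODEL", DEFAULT_OPENAI_MODEL)
    )
    anthropic_model: str = field(
        default_factory=lambda: _env("ANTHROPIC_MODEL", DEFAULT_ANTHROPIC_MODEL)
    )
```

Under these assumptions, an account without access to `gpt-5.1` could set `OPENAI_MODEL` to a model it can reach instead of editing the defaults.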

docs/bugs/INVESTIGATION_INVALID_MODELS.md CHANGED

@@ -9,22 +9,23 @@
 ## Issue Description
 The user encountered a 403 error when running in Magentic mode:
-`Error code: 403 - {'error': {'message': 'Project ... does not have access to model gpt-5.1', ... 'code': 'model_not_found'}}`
-This indicates the application is trying to use `gpt-5.1`, which the user's API key did not have access to (likely a beta/gated model).
+`Error code: 403 - {'error': {'message': 'Project ... does not have access to model gpt-5', ... 'code': 'model_not_found'}}`
 ## Root Cause Analysis
-The default config used `gpt-5.1` (beta/preview) and `claude-sonnet-4-5-20250929`.
-Initial remediation mistakenly downgraded these to 2024 models (`gpt-4o`).
-Web search confirmed that in November 2025:
-- `claude-sonnet-4-5-20250929` IS valid.
-- `gpt-5.1` exists but access is restricted (leading to 403).
-- `gpt-5` (August 2025) is the stable flagship.
+OpenAI deprecated the base `gpt-5` model. Tier 5 accounts now have access to:
+- `gpt-5.1` (current flagship)
+- `gpt-5-mini`
+- `gpt-5-nano`
+- `gpt-4.1`, `gpt-4.1-mini`, `gpt-4.1-nano`
+- `o3`, `o4-mini`
+The base `gpt-5` is NO LONGER available via API.
 ## Solution Implemented
 Updated `src/utils/config.py` to use:
-- `anthropic_model`: `claude-sonnet-4-5-20250929` (Restored correct Nov 2025 model)
-- `openai_model`: `gpt-5` (Changed from 5.1 to 5 to ensure stability/access).
+- `openai_model`: `gpt-5.1` (the actual current model)
+- `anthropic_model`: `claude-sonnet-4-5-20250929` (unchanged)
 ## Verification
-- `tests/unit/agent_factory/test_judges_factory.py` updated and passed.
+- `tests/unit/agent_factory/test_judges_factory.py` updated and passed.
+- User confirmed Tier 5 access to `gpt-5.1` via OpenAI dashboard.
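
Reviewer sketch (not part of the commit): the root cause above is a configured model ID that the account cannot reach. One way to guard against that class of error is to check the configured name against the set of accessible models before issuing a request. The helper below is a hypothetical illustration; `pick_model` and `AVAILABLE_MODELS` are names invented here, and the set simply copies the list from the investigation rather than querying the provider.

```python
from typing import Iterable, Sequence

# Model IDs the investigation lists as accessible to the Tier 5 account.
AVAILABLE_MODELS = frozenset({
    "gpt-5.1", "gpt-5-mini", "gpt-5-nano",
    "gpt-4.1", "gpt-4.1-mini", "gpt-4.1-nano",
    "o3", "o4-mini",
})


def pick_model(preferred: Sequence[str], available: Iterable[str]) -> str:
    """Return the first model in `preferred` that appears in `available`."""
    accessible = set(available)
    for name in preferred:
        if name in accessible:
            return name
    # Fail early with a clear message instead of a 403 at request time.
    raise LookupError(f"none of {list(preferred)} is accessible")


# "gpt-5" is absent from the accessible set, so the next preference is used.
chosen = pick_model(["gpt-5", "gpt-5.1"], AVAILABLE_MODELS)
```

Here `chosen` is `"gpt-5.1"`. In a real deployment the accessible set would presumably come from the provider's model-listing endpoint rather than a hard-coded constant.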

tests/unit/agent_factory/test_judges_factory.py CHANGED

@@ -25,11 +25,11 @@ def test_get_model_openai(mock_settings):
     """Test that OpenAI model is returned when provider is openai."""
     mock_settings.llm_provider = "openai"
     mock_settings.openai_api_key = "sk-test"
-    mock_settings.openai_model = "gpt-5"
+    mock_settings.openai_model = "gpt-5.1"
     model = get_model()
     assert isinstance(model, OpenAIChatModel)
-    assert model.model_name == "gpt-5"
+    assert model.model_name == "gpt-5.1"
 def test_get_model_anthropic(mock_settings):
@@ -58,7 +58,7 @@ def test_get_model_default_fallback(mock_settings):
     """Test fallback to OpenAI if provider is unknown."""
     mock_settings.llm_provider = "unknown_provider"
     mock_settings.openai_api_key = "sk-test"
-    mock_settings.openai_model = "gpt-5"
+    mock_settings.openai_model = "gpt-5.1"
     model = get_model()
     assert isinstance(model, OpenAIChatModel)
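
Reviewer sketch (not part of the commit): the tests above exercise a provider switch with a fallback to OpenAI for unknown providers. The snippet below is a minimal, self-contained model of that behavior as the test docstrings describe it. Every class here is a stand-in written for illustration, `AnthropicModel` is an assumed name, and unlike the real `get_model()` (which takes no arguments and reads patched settings) this version takes the settings explicitly.

```python
from dataclasses import dataclass


@dataclass
class Settings:
    # Hypothetical stand-in for the project's settings object.
    llm_provider: str = "openai"
    openai_model: str = "gpt-5.1"
    anthropic_model: str = "claude-sonnet-4-5-20250929"


@dataclass
class OpenAIChatModel:
    # Stand-in only; the real class comes from the project's LLM library.
    model_name: str


@dataclass
class AnthropicModel:
    # Stand-in only; the class name is an assumption.
    model_name: str


def get_model(settings: Settings):
    """Return the Anthropic model when requested, otherwise fall back to OpenAI."""
    if settings.llm_provider == "anthropic":
        return AnthropicModel(settings.anthropic_model)
    # "openai" and any unknown provider both take this path.
    return OpenAIChatModel(settings.openai_model)


fallback = get_model(Settings(llm_provider="unknown_provider"))
```

With these stand-ins, `fallback` is an `OpenAIChatModel` whose `model_name` is `"gpt-5.1"`, mirroring what `test_get_model_default_fallback` asserts.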