|
|
--- |
|
|
title: Roger Intelligence Platform |
|
|
emoji: ⚡ |
|
|
colorFrom: blue |
|
|
colorTo: green |
|
|
sdk: docker |
|
|
pinned: false |
|
|
--- |
|
|
|
|
|
# 🇱🇰 Roger Intelligence Platform |
|
|
|
|
|
**Real-Time Situational Awareness for Sri Lanka** |
|
|
|
|
|
A multi-agent AI system that aggregates intelligence from **50+ data sources** to provide risk analysis and opportunity detection for businesses operating in Sri Lanka. |
|
|
|
|
|
## 🌐 Live Demo |
|
|
|
|
|
| Component | URL | |
|
|
|-----------|-----| |
|
|
| **Frontend Dashboard** | [https://model-x-frontend-snowy.vercel.app/](https://model-x-frontend-snowy.vercel.app/) | |
|
|
| **Backend API** | [https://nivakaran-Roger.hf.space](https://nivakaran-Roger.hf.space) | |
|
|
|
|
|
--- |
|
|
|
|
|
## 🎯 Key Features |
|
|
|
|
|
✅ **5 Domain Agents + 2 Orchestrators** running in parallel: |
|
|
- **Social Agent** - Reddit, Twitter, Facebook, Threads, BlueSky monitoring |
|
|
- **Political Agent** - Gazette, Parliament, District Social Media |
|
|
- **Economical Agent** - CSE Stock Market + Technical Indicators (SMA, EMA, RSI, MACD) |
|
|
- **Meteorological Agent** - DMC Weather + RiverNet + **FloodWatch Integration** |
|
|
- **Intelligence Agent** - Brand Monitoring + Threat Detection + **User-Configurable Targets** |
|
|
- **Combined Agent (Orchestrator)** - Fan-out/Fan-in coordination, LLM filtering, feed ranking |
|
|
- **Data Retrieval Agent** - Web scraping orchestration with anti-bot features |
|
|
|
|
|
✅ **Situational Awareness Dashboard**: |
|
|
- **CEB Power Status** - Load shedding / power outage monitoring |
|
|
- **Fuel Prices** - Petrol 92/95, Diesel, Kerosene (CEYPETCO) |
|
|
- **CBSL Economic Indicators** - Inflation, policy rates, forex reserves, USD/LKR |
|
|
- **Health Alerts** - Dengue case tracking, disease outbreak monitoring |
|
|
- **Commodity Prices** - 15 essential goods (rice, sugar, gas, eggs, etc.) |
|
|
- **Water Supply Status** - NWSDB disruption alerts |
|
|
|
|
|
✅ **ML Anomaly Detection Pipeline** (Integrated into Graph): |
|
|
- Language-specific BERT models (Sinhala, Tamil, English) |
|
|
- Real-time anomaly inference on every graph cycle |
|
|
- Clustering (DBSCAN, KMeans, HDBSCAN) |
|
|
- Anomaly Detection (Isolation Forest, LOF) |
|
|
- MLflow + DagsHub tracking |
|
|
|
|
|
✅ **Weather Prediction ML Pipeline**: |
|
|
- LSTM Neural Network (30-day sequences) |
|
|
- Predicts: Temperature, Rainfall, Flood Risk, Severity |
|
|
- 21 weather stations → 25 districts |
|
|
- Airflow DAG runs daily at 4 AM |
|
|
|
|
|
✅ **Currency Prediction ML Pipeline**: |
|
|
- GRU Neural Network (optimized for 8GB RAM) |
|
|
- Predicts: USD/LKR exchange rate |
|
|
- Features: Technical indicators + CSE + Gold + Oil + USD Index |
|
|
- MLflow tracking + Airflow DAG at 4 AM |
|
|
|
|
|
✅ **Stock Price Prediction ML Pipeline**: |
|
|
- Multi-Architecture: LSTM, GRU, BiLSTM, BiGRU |
|
|
- Optuna hyperparameter tuning (30 trials per stock) |
|
|
- Per-stock best model selection |
|
|
- 10 top CSE stocks (JKH, COMB, DIAL, HNB, etc.) |
|
|
|
|
|
✅ **RAG-Powered Chatbot**: |
|
|
- Chat-history aware Q&A |
|
|
- Queries all ChromaDB intelligence collections |
|
|
- Domain filtering (political, economic, weather, social) |
|
|
- Floating chat UI in dashboard |
|
|
|
|
|
✅ **Trending/Velocity Detection**: |
|
|
- SQLite-based topic frequency tracking (24-hour rolling window) |
|
|
- Momentum calculation: `current_hour / avg_last_6_hours` |
|
|
- Spike alerts when topic volume > 3x baseline |
|
|
- Integrated into Combined Agent dashboard |
|
|
|
|
|
✅ **Real-Time Dashboard** with: |
|
|
- Live Intelligence Feed |
|
|
- Floating AI Chatbox |
|
|
- Weather Predictions Tab |
|
|
- **Live Satellite/Weather Map** (Windy.com) |
|
|
- **National Flood Threat Score** |
|
|
- **30-Year Historical Climate Analysis** |
|
|
- **Trending Topics & Spike Alerts** |
|
|
- **Enhanced Operational Indicators** (infrastructure_health, regulatory_activity, investment_climate) |
|
|
- Operational Risk Radar |
|
|
- ML Anomaly Detection Display |
|
|
- Market Predictions with Moving Averages |
|
|
- Risk & Opportunity Classification |
|
|
|
|
|
✅ **Weather Data Scraper for ML Training**: |
|
|
- Open-Meteo API (free historical data) |
|
|
- NASA FIRMS (fire/heat detection) |
|
|
- All 25 districts coverage |
|
|
- Year-wise CSV export for model training |
|
|
|
|
|
✅ **Operational Dashboard Metrics**: |
|
|
- **Logistics Friction**: Average confidence of mobility/social domain risk events |
|
|
- **Compliance Volatility**: Average confidence of political domain risks |
|
|
- **Market Instability**: Average confidence of market/economical domain risks |
|
|
- **Opportunity Index**: Average confidence of opportunity-classified events |
|
|
|
|
|
✅ **Multi-District Province-Aware Event Categorization**: |
|
|
- Events mentioning provinces are displayed in all constituent districts |
|
|
- Supports: Western, Southern, Central, Northern, Eastern, Sabaragamuwa, Uva, North Western, North Central provinces |
|
|
- Both frontend (MapView, DistrictInfoPanel) and backend are synchronized |
|
|
|
|
|
✅ **3-Tier Storage Architecture** with Deduplication: |
|
|
- **Tier 1: SQLite** - Fast hash-based exact match (microseconds) |
|
|
- **Tier 2: ChromaDB** - Semantic similarity search with sentence transformers (milliseconds) |
|
|
- **Tier 3: Neo4j Aura** - Knowledge graph for event relationships and entity tracking |
|
|
- Unified `StorageManager` orchestrates all backends |
|
|
- Deduplication prevents duplicate feeds across all domain agents |
|
|
|
|
|
--- |
|
|
|
|
|
## 🏗️ System Architecture |
|
|
|
|
|
``` |
|
|
┌─────────────────────────────────────────────────────────────────────────┐ |
|
|
│ Roger Combined Graph │ |
|
|
│ ┌────────────────────────────────────────────────────────────────┐ │ |
|
|
│ │ Graph Initiator (Reset) │ │ |
|
|
│ └────────────────────────────────────────────────────────────────┘ │ |
|
|
│ │ Fan-Out │ |
|
|
│ ┌────────────┬────────────┼────────────┬────────────┬────────────┐ │ |
|
|
│ ▼ ▼ ▼ ▼ ▼ ▼ │ |
|
|
│ ┌──────┐ ┌──────┐ ┌──────────┐ ┌──────┐ ┌──────────┐ ┌────┐│ |
|
|
│ │Social│ │Econ │ │Political │ │Meteo │ │Intellig- │ │Data││ |
|
|
│ │Agent │ │Agent │ │Agent │ │Agent │ │ence Agent│ │Retr││ |
|
|
│ └──────┘ └──────┘ └──────────┘ └──────┘ └──────────┘ └────┘│ |
|
|
│ │ │ │ │ │ │ │ |
|
|
│ └────────────┴────────────┴────────────┴────────────┴────────────┘ │ |
|
|
│ │ Fan-In │ |
|
|
│ ┌─────────▼──────────┐ │ |
|
|
│ │ Feed Aggregator │ │ |
|
|
│ │ (Rank & Dedupe) │ │ |
|
|
│ └─────────┬──────────┘ │ |
|
|
│ ┌─────────▼──────────┐ │ |
|
|
│ │ Vectorization │ │ |
|
|
│ │ Agent (Optional) │ │ |
|
|
│ └─────────┬──────────┘ │ |
|
|
│ ┌─────────▼──────────┐ │ |
|
|
│ │ Router (Loop/End) │ │ |
|
|
│ └────────────────────┘ │ |
|
|
└─────────────────────────────────────────────────────────────────────────┘ |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
## 📊 Graph Implementations |
|
|
|
|
|
### 1. Combined Agent Graph (`combinedAgentGraph.py`) |
|
|
**The Mother Graph** - Orchestrates all domain agents in parallel. |
|
|
|
|
|
```mermaid |
|
|
graph TD |
|
|
A[Graph Initiator] -->|Fan-Out| B[Social Agent] |
|
|
A -->|Fan-Out| C[Economic Agent] |
|
|
A -->|Fan-Out| D[Political Agent] |
|
|
A -->|Fan-Out| E[Meteorological Agent] |
|
|
A -->|Fan-Out| F[Intelligence Agent] |
|
|
A -->|Fan-Out| G[Data Retrieval Agent] |
|
|
B -->|Fan-In| H[Feed Aggregator] |
|
|
C --> H |
|
|
D --> H |
|
|
E --> H |
|
|
F --> H |
|
|
G --> H |
|
|
H --> I[Data Refresher] |
|
|
I --> J{Router} |
|
|
J -->|Loop| A |
|
|
J -->|End| K[END] |
|
|
``` |
|
|
|
|
|
**Key Features:** |
|
|
- Custom state reducers for parallel execution |
|
|
- Feed deduplication with content hashing |
|
|
- Loop control with configurable intervals |
|
|
- Real-time WebSocket broadcasting |
|
|
|
|
|
**Architecture Improvements (v2.1):** |
|
|
- **Rate Limiting**: Domain-specific rate limits prevent anti-bot detection |
|
|
- Twitter: 15 RPM, LinkedIn: 10 RPM, News: 60 RPM |
|
|
- Thread-safe semaphores for max concurrent requests |
|
|
- **Error Handling**: Per-agent try/catch prevents cascading failures |
|
|
- Failed agents return empty results, others continue |
|
|
- **Non-Blocking Refresh**: 60-second cycle with interruptible sleep |
|
|
- `threading.Event.wait()` instead of blocking `time.sleep()` |
|
|
|
|
|
### Storage Data Flow |
|
|
|
|
|
``` |
|
|
┌─────────────────────────────────────────────────────────────────────────────┐ |
|
|
│ DOMAIN AGENTS (Parallel) │ |
|
|
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────────┐ │ |
|
|
│ │ Social │ │Political │ │Economic │ │ Meteo │ │ Intelligence │ │ |
|
|
│ │ Agent │ │ Agent │ │ Agent │ │ Agent │ │ Agent │ │ |
|
|
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ └──────┬───────┘ │ |
|
|
│ └────────────┴────────────┴────────────┴──────────────┘ │ |
|
|
│ │ Fan-In │ |
|
|
│ ┌────────────▼─────────────┐ │ |
|
|
│ │ CombinedAgentNode │ │ |
|
|
│ │ (LLM Filter + Rank) │ │ |
|
|
│ └────────────┬─────────────┘ │ |
|
|
└─────────────────────────────────┼───────────────────────────────────────────┘ |
|
|
│ |
|
|
┌─────────────▼──────────────┐ |
|
|
│ StorageManager │ |
|
|
│ (3-Tier Deduplication) │ |
|
|
└─────────────┬──────────────┘ |
|
|
┌───────────────────────┼──────────────────────────┐ |
|
|
│ │ │ |
|
|
▼ ▼ ▼ |
|
|
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────────────┐ |
|
|
│ SQLite │ │ ChromaDB │ │ Neo4j Aura │ |
|
|
│ (Fast Cache) │ │ (Vector Store) │ │ (Knowledge Graph) │ |
|
|
│ ───────────── │ │ ────────────── │ │ ─────────────────── │ |
|
|
│ Hash-based │ │ Semantic search │ │ Event relationships │ |
|
|
│ Exact match │ │ Similarity 0.85 │ │ Domain nodes │ |
|
|
│ ~microseconds │ │ ~milliseconds │ │ Entity tracking │ |
|
|
└─────────────────┘ └──────────────────┘ └─────────────────────────┘ |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
### 2. Political Agent Graph (`politicalAgentGraph.py`) |
|
|
**3-Module Hybrid Architecture** |
|
|
|
|
|
| Module | Description | Sources | |
|
|
|--------|-------------|---------| |
|
|
| **Official Sources** | Government data | Gazette, Parliament Minutes | |
|
|
| **Social Media** | Political sentiment | Twitter, Facebook, Reddit (National + 25 Districts) | |
|
|
| **Feed Generation** | LLM Processing | Categorize → Summarize → Format | |
|
|
|
|
|
``` |
|
|
┌─────────────────────────────────────────────┐ |
|
|
│ Module 1: Official │ Module 2: Social │ |
|
|
│ ┌─────────────────┐ │ ┌───────────────┐ │ |
|
|
│ │ Gazette │ │ │ National │ │ |
|
|
│ │ Parliament │ │ │ Districts (25)│ │ |
|
|
│ └─────────────────┘ │ │ World Politics│ │ |
|
|
│ │ └───────────────┘ │ |
|
|
└────────────┬───────────┴────────┬──────────┘ |
|
|
│ Fan-In │ |
|
|
▼ ▼ |
|
|
┌────────────────────────────┐ |
|
|
│ Module 3: Feed Generation │ |
|
|
│ Categorize → LLM → Format │ |
|
|
└────────────────────────────┘ |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
### 3. Economic Agent Graph (`economicalAgentGraph.py`) |
|
|
**Market Intelligence & Technical Analysis** |
|
|
|
|
|
| Component | Description | |
|
|
|-----------|-------------| |
|
|
| **Stock Collector** | CSE market data (200+ stocks) | |
|
|
| **Technical Analyzer** | SMA, EMA, RSI, MACD | |
|
|
| **Trend Detector** | Bullish/Bearish signals | |
|
|
| **Feed Generator** | Risk/Opportunity classification | |
|
|
|
|
|
**Indicators Calculated:** |
|
|
- Simple Moving Average (SMA-20, SMA-50) |
|
|
- Exponential Moving Average (EMA-12, EMA-26) |
|
|
- Relative Strength Index (RSI) |
|
|
- MACD with Signal Line |
|
|
|
|
|
--- |
|
|
|
|
|
### 4. Meteorological Agent Graph (`meteorologicalAgentGraph.py`) |
|
|
**Weather & Disaster Monitoring + FloodWatch Integration** |
|
|
|
|
|
``` |
|
|
┌─────────────────────────────────────┐ |
|
|
│ DMC Weather Collector │ |
|
|
│ (Daily forecasts, 25 districts) │ |
|
|
└─────────────┬───────────────────────┘ |
|
|
│ |
|
|
▼ |
|
|
┌─────────────────────────────────────┐ |
|
|
│ RiverNet Data Collector │ |
|
|
│ (River levels, flood monitoring) │ |
|
|
└─────────────┬───────────────────────┘ |
|
|
│ |
|
|
▼ |
|
|
┌─────────────────────────────────────┐ |
|
|
│ FloodWatch Historical Data │ |
|
|
│ (30-year climate analysis) │ |
|
|
└─────────────┬───────────────────────┘ |
|
|
│ |
|
|
▼ |
|
|
┌─────────────────────────────────────┐ |
|
|
│ National Threat Calculator │ |
|
|
│ (Aggregated flood risk 0-100) │ |
|
|
└─────────────┬───────────────────────┘ |
|
|
│ |
|
|
▼ |
|
|
┌─────────────────────────────────────┐ |
|
|
│ Alert Generator │ |
|
|
│ (Severity classification) │ |
|
|
└─────────────────────────────────────┘ |
|
|
``` |
|
|
|
|
|
**Alert Levels:** |
|
|
- 🟢 Normal: Standard conditions |
|
|
- 🟡 Advisory: Watch for developments |
|
|
- 🟠 Warning: Take precautions |
|
|
- 🔴 Critical: Immediate action required |
|
|
|
|
|
**FloodWatch Features:** |
|
|
| Feature | Description | |
|
|
|---------|-------------| |
|
|
| **Historical Analysis** | 30-year climate data (1995-2025) | |
|
|
| **Decadal Comparison** | 3 periods: 1995-2004, 2005-2014, 2015-2025 | |
|
|
| **National Threat Score** | 0-100 aggregated risk from rivers + alerts + season | |
|
|
| **High-Risk Periods** | May-Jun (SW Monsoon), Oct-Nov (NE Monsoon) | |
|
|
|
|
|
--- |
|
|
|
|
|
### 5. Social Agent Graph (`socialAgentGraph.py`) |
|
|
**Multi-Platform Social Media Monitoring** |
|
|
|
|
|
| Platform | Data Source | Coverage | |
|
|
|----------|-------------|----------| |
|
|
| Reddit | PRAW API | r/srilanka, r/colombo | |
|
|
| Twitter/X | Nitter scraping | #SriLanka, #Colombo | |
|
|
| Facebook | Profile scraping | News pages | |
|
|
| Threads | Meta API | Trending topics | |
|
|
| BlueSky | AT Protocol | Political discourse | |
|
|
|
|
|
--- |
|
|
|
|
|
### 6. Intelligence Agent Graph (`intelligenceAgentGraph.py`) |
|
|
**Brand & Threat Monitoring + User-Configurable Targets** |
|
|
|
|
|
``` |
|
|
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ |
|
|
│ Brand Monitor │ │ Threat Scanner │ │ User Targets │ |
|
|
│ - Company news │ │ - Security │ │ - Custom keys │ |
|
|
│ - Competitor │ │ - Compliance │ │ - User profiles │ |
|
|
│ - Market share │ │ - Geopolitical │ │ - Products │ |
|
|
└────────┬────────┘ └────────┬────────┘ └────────┬────────┘ |
|
|
│ │ │ |
|
|
└──────────────────────┼──────────────────────┘ |
|
|
▼ |
|
|
┌─────────────────────┐ |
|
|
│ Intelligence Report │ |
|
|
│ (Priority ranked) │ |
|
|
└─────────────────────┘ |
|
|
``` |
|
|
|
|
|
**User-Configurable Monitoring**: |
|
|
Users can define custom monitoring targets via the frontend settings panel or API: |
|
|
|
|
|
| Config Type | Description | Example | |
|
|
|-------------|-------------|---------| |
|
|
| **Keywords** | Custom search terms | "Colombo Port", "BOI Investment" | |
|
|
| **Products** | Products to track | "iPhone 15", "Samsung Galaxy" | |
|
|
| **Profiles** | Social media accounts | @CompetitorX (Twitter), CompanyY (Facebook) | |
|
|
|
|
|
**API Endpoints:** |
|
|
```bash |
|
|
# Get current config |
|
|
GET /api/intel/config |
|
|
|
|
|
# Update full config |
|
|
POST /api/intel/config |
|
|
Body: {"user_keywords": ["keyword1"], "user_profiles": {"twitter": ["@account"]}, "user_products": ["Product"]} |
|
|
|
|
|
# Add single target |
|
|
POST /api/intel/config/add?target_type=keyword&value=Colombo+Port |
|
|
|
|
|
# Remove target |
|
|
DELETE /api/intel/config/remove?target_type=profile&value=CompetitorX&platform=twitter |
|
|
``` |
|
|
|
|
|
**Config File**: `src/config/intel_config.json` |
|
|
|
|
|
--- |
|
|
|
|
|
### 7. DATA Retrieval Agent Graph (`dataRetrievalAgentGraph.py`) |
|
|
**Web Scraping Orchestrator** |
|
|
|
|
|
**Scraping Tools Available:** |
|
|
- `scrape_news_site` - Generic news scraper |
|
|
- `scrape_cse_live` - CSE stock prices |
|
|
- `scrape_official_data` - Government portals |
|
|
- `scrape_social_media` - Multi-platform |
|
|
|
|
|
**Anti-Bot Features:** |
|
|
- Random delays (1-3s) |
|
|
- User-agent rotation |
|
|
- Retry with exponential backoff |
|
|
- Headless browser fallback |
|
|
|
|
|
--- |
|
|
|
|
|
### 8. Vectorization Agent Graph (`vectorizationAgentGraph.py`) |
|
|
**6-Step Multilingual NLP Pipeline with Anomaly + Trending Detection** |
|
|
|
|
|
``` |
|
|
┌─────────────────────────────────────────────────┐ |
|
|
│ Step 1: Language Detection │ |
|
|
│ FastText + Unicode script analysis │ |
|
|
│ Supports: English, Sinhala (සිංහල), Tamil (தமிழ்)│ |
|
|
└─────────────────┬───────────────────────────────┘ |
|
|
│ |
|
|
▼ |
|
|
┌─────────────────────────────────────────────────┐ |
|
|
│ Step 2: Text Vectorization │ |
|
|
│ ┌─────────────┬─────────────┬─────────────────┐ │ |
|
|
│ │ DistilBERT │ SinhalaBERTo│ Tamil-BERT │ │ |
|
|
│ │ (English) │ (Sinhala) │ (Tamil) │ │ |
|
|
│ └─────────────┴─────────────┴─────────────────┘ │ |
|
|
│ Output: 768-dim vector per text │ |
|
|
└─────────────────┬───────────────────────────────┘ |
|
|
│ |
|
|
▼ |
|
|
┌─────────────────────────────────────────────────┐ |
|
|
│ Step 3: Anomaly Detection (Isolation Forest) │ |
|
|
│ - English: ML model inference │ |
|
|
│ - Sinhala/Tamil: Skipped (incompatible vectors) │ |
|
|
│ - Outputs anomaly_score (0-1) │ |
|
|
└─────────────────┬───────────────────────────────┘ |
|
|
│ |
|
|
▼ |
|
|
┌─────────────────────────────────────────────────┐ |
|
|
│ Step 4: Trending Detection │ |
|
|
│ - Entity extraction (hashtags, proper nouns) │ |
|
|
│ - Momentum: current_hour / avg_last_6_hours │ |
|
|
│ - Spike alerts when momentum > 3x │ |
|
|
└─────────────────┬───────────────────────────────┘ |
|
|
│ |
|
|
▼ |
|
|
┌─────────────────────────────────────────────────┐ |
|
|
│ Step 5: Expert Summary (GroqLLM) │ |
|
|
│ - Opportunity & threat identification │ |
|
|
│ - Sentiment analysis │ |
|
|
└─────────────────┬───────────────────────────────┘ |
|
|
│ |
|
|
▼ |
|
|
┌─────────────────────────────────────────────────┐ |
|
|
│ Step 6: Format Output │ |
|
|
│ - Includes anomaly + trending in domain_insights│ |
|
|
└─────────────────────────────────────────────────┘ |
|
|
``` |
|
|
|
|
|
**Trending Detection API Endpoints:** |
|
|
|
|
|
| Endpoint | Method | Description | |
|
|
|----------|--------|-------------| |
|
|
| `/api/trending` | GET | Get trending topics & spike alerts | |
|
|
| `/api/trending/topic/{topic}` | GET | Get hourly history for a topic | |
|
|
| `/api/trending/record` | POST | Record a topic mention (testing) | |
|
|
|
|
|
--- |
|
|
|
|
|
### 10. Weather Prediction Pipeline (`models/weather-prediction/`) |
|
|
**LSTM-Based Multi-District Weather Forecasting** |
|
|
|
|
|
``` |
|
|
┌─────────────────────────────────────────────────┐ |
|
|
│ Data Source: Tutiempo.net (21 stations) │ |
|
|
│ Historical data since 1944 │ |
|
|
└─────────────────┬───────────────────────────────┘ |
|
|
│ |
|
|
▼ |
|
|
┌─────────────────────────────────────────────────┐ |
|
|
│ LSTM Neural Network │ |
|
|
│ ┌─────────────────────────────────────────────┐ │ |
|
|
│ │ Input: 30-day sequence (11 features) │ │ |
|
|
│ │ Layer 1: LSTM(64) + BatchNorm + Dropout │ │ |
|
|
│ │ Layer 2: LSTM(32) + BatchNorm + Dropout │ │ |
|
|
│ │ Output: Dense(3) → temp_max, temp_min, rain │ │ |
|
|
│ └─────────────────────────────────────────────┘ │ |
|
|
└─────────────────┬───────────────────────────────┘ |
|
|
│ |
|
|
▼ |
|
|
┌─────────────────────────────────────────────────┐ |
|
|
│ Severity Classifier │ |
|
|
│ - Combines temp, rainfall, flood risk │ |
|
|
│ - Outputs: normal/advisory/warning/critical │ |
|
|
└─────────────────┬───────────────────────────────┘ |
|
|
│ |
|
|
▼ |
|
|
┌─────────────────────────────────────────────────┐ |
|
|
│ Output: 25 District Predictions │ |
|
|
│ - Temperature (high/low °C) │ |
|
|
│ - Rainfall (mm + probability) │ |
|
|
│ - Flood risk (integrated with RiverNet) │ |
|
|
└─────────────────────────────────────────────────┘ |
|
|
``` |
|
|
|
|
|
**Usage:** |
|
|
```bash |
|
|
# Run full pipeline |
|
|
cd models/weather-prediction |
|
|
python main.py --mode full |
|
|
|
|
|
# Just predictions |
|
|
python main.py --mode predict |
|
|
|
|
|
# Train specific station |
|
|
python main.py --mode train --station COLOMBO |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
### 11. Currency Prediction Pipeline (`models/currency-volatility-prediction/`) |
|
|
**GRU-Based USD/LKR Exchange Rate Forecasting** |
|
|
|
|
|
``` |
|
|
┌─────────────────────────────────────────────────┐ |
|
|
│ Data Sources (yfinance) │ |
|
|
│ - USD/LKR exchange rate │ |
|
|
│ - CSE stock index (correlation) │ |
|
|
│ - Gold, Oil prices (global factors) │ |
|
|
│ - USD strength index │ |
|
|
└─────────────────┬───────────────────────────────┘ |
|
|
│ |
|
|
▼ |
|
|
┌─────────────────────────────────────────────────┐ |
|
|
│ Feature Engineering (25+ features) │ |
|
|
│ - SMA, EMA, RSI, MACD, Bollinger Bands │ |
|
|
│ - Volatility, Momentum indicators │ |
|
|
│ - Temporal encoding (day/month cycles) │ |
|
|
└─────────────────┬───────────────────────────────┘ |
|
|
│ |
|
|
▼ |
|
|
┌─────────────────────────────────────────────────┐ |
|
|
│ GRU Neural Network (8GB RAM optimized) │ |
|
|
│ ┌─────────────────────────────────────────────┐ │ |
|
|
│ │ Input: 30-day sequence │ │ |
|
|
│ │ Layer 1: GRU(64) + BatchNorm + Dropout │ │ |
|
|
│ │ Layer 2: GRU(32) + BatchNorm + Dropout │ │ |
|
|
│ │ Output: Dense(1) → next_day_rate │ │ |
|
|
│ └─────────────────────────────────────────────┘ │ |
|
|
└─────────────────┬───────────────────────────────┘ |
|
|
│ |
|
|
▼ |
|
|
┌─────────────────────────────────────────────────┐ |
|
|
│ Output: USD/LKR Prediction │ |
|
|
│ - Current & predicted rate │ |
|
|
│ - Change % and direction │ |
|
|
│ - Volatility classification (low/medium/high) │ |
|
|
└─────────────────────────────────────────────────┘ |
|
|
``` |
|
|
|
|
|
**Usage:** |
|
|
```bash |
|
|
# Run full pipeline |
|
|
cd models/currency-volatility-prediction |
|
|
python main.py --mode full |
|
|
|
|
|
# Just predict |
|
|
python main.py --mode predict |
|
|
|
|
|
# Train GRU model |
|
|
python main.py --mode train --epochs 100 |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
### 12. RAG Chatbot (`src/rag.py`) |
|
|
**Chat-History Aware Intelligence Q&A** |
|
|
|
|
|
``` |
|
|
┌─────────────────────────────────────────────────┐ |
|
|
│ MultiCollectionRetriever │ |
|
|
│ - Connects to ChromaDB intelligence collection │ |
|
|
│ - Roger_feeds (all agent domain feeds) │ |
|
|
└─────────────────┬───────────────────────────────┘ |
|
|
│ |
|
|
▼ |
|
|
┌─────────────────────────────────────────────────┐ |
|
|
│ Question Reformulation (History-Aware) │ |
|
|
│ - Uses last 3-5 exchanges for context │ |
|
|
│ - Reformulates follow-up questions │ |
|
|
└─────────────────┬───────────────────────────────┘ |
|
|
│ |
|
|
▼ |
|
|
┌─────────────────────────────────────────────────┐ |
|
|
│ Groq LLM (llama-3.1-70b-versatile) │ |
|
|
│ - RAG with source citations │ |
|
|
│ - Domain-specific analysis │ |
|
|
└─────────────────────────────────────────────────┘ |
|
|
``` |
|
|
|
|
|
**Usage:** |
|
|
```bash |
|
|
# CLI mode |
|
|
python src/rag.py |
|
|
|
|
|
# Or via API |
|
|
curl -X POST http://localhost:8000/api/rag/chat \ |
|
|
-H "Content-Type: application/json" \ |
|
|
-d '{"message": "What are the latest political events?"}' |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
## 🤖 ML Anomaly Detection Pipeline |
|
|
|
|
|
Located in `models/anomaly-detection/` |
|
|
|
|
|
### Pipeline Components |
|
|
|
|
|
| Component | File | Description | |
|
|
|-----------|------|-------------| |
|
|
| Data Ingestion | `data_ingestion.py` | SQLite + CSV fetching | |
|
|
| Data Validation | `data_validation.py` | Schema-based validation | |
|
|
| Data Transformation | `data_transformation.py` | Language detection + BERT vectorization | |
|
|
| Model Trainer | `model_trainer.py` | Optuna + MLflow training | |
|
|
|
|
|
### Clustering Models |
|
|
|
|
|
| Model | Type | Use Case | |
|
|
|-------|------|----------| |
|
|
| **DBSCAN** | Density-based | Noise-robust clustering | |
|
|
| **KMeans** | Centroid-based | Fast, fixed k clusters | |
|
|
| **HDBSCAN** | Hierarchical density | Variable density clusters | |
|
|
| **Isolation Forest** | Anomaly detection | Outlier identification | |
|
|
| **LOF** | Local outlier | Density-based anomalies | |
|
|
|
|
|
### Training with Optuna |
|
|
|
|
|
```python |
|
|
# Hyperparameter optimization |
|
|
study = optuna.create_study(direction="maximize") |
|
|
study.optimize(objective, n_trials=50) |
|
|
``` |
|
|
|
|
|
### MLflow Tracking |
|
|
|
|
|
```python |
|
|
mlflow.set_tracking_uri("https://dagshub.com/...") |
|
|
mlflow.log_params(best_params) |
|
|
mlflow.log_metrics(metrics) |
|
|
mlflow.sklearn.log_model(model, "model") |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
## 🌧️ Weather Data Scraper (`scripts/scrape_weather_data.py`) |
|
|
|
|
|
**Historical weather data collection for ML model training** |
|
|
|
|
|
### Data Sources |
|
|
|
|
|
| Source | API Key? | Data Available | |
|
|
|--------|----------|----------------| |
|
|
| **Open-Meteo** | ❌ Free | Historical weather since 1940 | |
|
|
| **NASA FIRMS** | ✅ Optional | Fire/heat spot detection | |
|
|
|
|
|
### Collected Weather Variables |
|
|
|
|
|
- `temperature_2m_max/min/mean` |
|
|
- `precipitation_sum`, `rain_sum` |
|
|
- `precipitation_hours` |
|
|
- `wind_speed_10m_max`, `wind_gusts_10m_max` |
|
|
- `wind_direction_10m_dominant` |
|
|
|
|
|
### Usage |
|
|
|
|
|
```bash |
|
|
# Scrape last 30 days (default) |
|
|
python scripts/scrape_weather_data.py |
|
|
|
|
|
# Scrape specific date range |
|
|
python scripts/scrape_weather_data.py --start 2020-01-01 --end 2024-12-31 |
|
|
|
|
|
# Scrape multiple years for training dataset |
|
|
python scripts/scrape_weather_data.py --years 2020,2021,2022,2023,2024 |
|
|
|
|
|
# Include fire detection data |
|
|
python scripts/scrape_weather_data.py --years 2023,2024 --fires |
|
|
|
|
|
# Hourly resolution (default is daily) |
|
|
python scripts/scrape_weather_data.py --start 2024-01-01 --end 2024-01-31 --resolution hourly |
|
|
``` |
|
|
|
|
|
### Output |
|
|
|
|
|
``` |
|
|
datasets/weather/ |
|
|
├── weather_daily_2020-01-01_2020-12-31.csv |
|
|
├── weather_daily_2021-01-01_2021-12-31.csv |
|
|
├── weather_combined.csv (merged file) |
|
|
└── fire_detections_20241207.csv |
|
|
``` |
|
|
|
|
|
### Coverage |
|
|
|
|
|
All 25 Sri Lankan districts with coordinates: |
|
|
- Colombo, Gampaha, Kalutara, Kandy, Matale, Nuwara Eliya |
|
|
- Galle, Matara, Hambantota, Jaffna, Kilinochchi, Mannar |
|
|
- Vavuniya, Mullaitivu, Batticaloa, Ampara, Trincomalee |
|
|
- Kurunegala, Puttalam, Anuradhapura, Polonnaruwa |
|
|
- Badulla, Monaragala, Ratnapura, Kegalle |
|
|
|
|
|
--- |
|
|
|
|
|
## 🚀 Quick Start |
|
|
|
|
|
### Prerequisites |
|
|
- Python 3.11+ |
|
|
- Node.js 18+ |
|
|
- Docker Desktop (for Airflow) |
|
|
- Groq API Key |
|
|
|
|
|
### Installation |
|
|
|
|
|
```bash |
|
|
# 1. Clone repository |
|
|
git clone <your-repo> |
|
|
cd Roger-Final |
|
|
|
|
|
# 2. Create virtual environment |
|
|
python -m venv .venv |
|
|
source .venv/bin/activate # Linux/Mac |
|
|
.\.venv\Scripts\activate # Windows |
|
|
|
|
|
# 3. Install dependencies |
|
|
pip install -r requirements.txt |
|
|
|
|
|
# 4. Configure environment |
|
|
cp .env.template .env |
|
|
# Edit .env with your API keys |
|
|
|
|
|
# 5. Download ML models |
|
|
python models/anomaly-detection/download_models.py |
|
|
|
|
|
# 6. Launch all services |
|
|
./start_services.sh # Linux/Mac |
|
|
.\start_services.ps1 # Windows |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
## 🔧 API Endpoints |
|
|
|
|
|
### REST API (FastAPI - Port 8000) |
|
|
|
|
|
| Endpoint | Method | Description | |
|
|
|----------|--------|-------------| |
|
|
| `/api/status` | GET | System health | |
|
|
| `/api/dashboard` | GET | Risk metrics | |
|
|
| `/api/feed` | GET | Latest events | |
|
|
| `/api/feeds` | GET | All feeds with pagination | |
|
|
| `/api/feeds/by_district` | GET | Feeds filtered by district | |
|
|
| `/api/rivernet` | GET | River monitoring data | |
|
|
| `/api/predict` | POST | Run anomaly predictions | |
|
|
| `/api/anomalies` | GET | Get anomalous feeds | |
|
|
| `/api/model/status` | GET | ML model status | |
|
|
| `/api/weather/predictions` | GET | All district forecasts | |
|
|
| `/api/weather/predictions/{district}` | GET | Single district | |
|
|
| `/api/weather/model/status` | GET | Weather model info | |
|
|
| `/api/weather/historical` | GET | 30-year climate analysis | |
|
|
| `/api/weather/threat` | GET | National flood threat score | |
|
|
| `/api/currency/prediction` | GET | USD/LKR next-day forecast | |
|
|
| `/api/currency/history` | GET | Historical rates | |
|
|
| `/api/currency/model/status` | GET | Currency model info | |
|
|
| `/api/stocks/predictions` | GET | All CSE stock forecasts | |
|
|
| `/api/stocks/predictions/{symbol}` | GET | Single stock prediction | |
|
|
| `/api/stocks/model/status` | GET | Stock models info | |
|
|
| `/api/rag/chat` | POST | Chat with RAG | |
|
|
| `/api/rag/stats` | GET | RAG system stats | |
|
|
| `/api/rag/clear` | POST | Clear chat history | |
|
|
| `/api/power` | GET | CEB power/load shedding status | |
|
|
| `/api/fuel` | GET | Current fuel prices | |
|
|
| `/api/economy` | GET | CBSL economic indicators | |
|
|
| `/api/health` | GET | Health alerts & dengue data | |
|
|
| `/api/commodities` | GET | Essential goods prices | |
|
|
| `/api/water` | GET | Water supply disruptions | |
|
|
|
|
|
### WebSocket |
|
|
- `ws://localhost:8000/ws` - Real-time updates |
|
|
|
|
|
--- |
|
|
|
|
|
## ⏰ Airflow Orchestration |
|
|
|
|
|
### DAG: `anomaly_detection_training` |
|
|
|
|
|
``` |
|
|
start → check_records → data_ingestion → data_validation |
|
|
→ data_transformation → model_training → end |
|
|
``` |
|
|
|
|
|
**Triggers:** |
|
|
- Batch threshold: 1000 new records |
|
|
- Daily fallback: Every 24 hours |
|
|
|
|
|
**Access Dashboard:** |
|
|
```bash |
|
|
cd models/anomaly-detection |
|
|
astro dev start |
|
|
# Open http://localhost:8080 |
|
|
``` |
|
|
|
|
|
### DAG: `weather_prediction_daily` |
|
|
|
|
|
``` |
|
|
ingest_data → train_models → generate_predictions → publish_predictions |
|
|
``` |
|
|
|
|
|
**Schedule:** Daily at 4:00 AM IST |
|
|
|
|
|
**Tasks:** |
|
|
- Scrape Tutiempo.net for latest data |
|
|
- Train LSTM models (MLflow tracked) |
|
|
- Generate 25-district predictions |
|
|
- Save to JSON for API |
|
|
|
|
|
### DAG: `currency_prediction_daily` |
|
|
|
|
|
``` |
|
|
ingest_data → train_model → generate_prediction → publish_prediction |
|
|
``` |
|
|
|
|
|
**Schedule:** Daily at 4:00 AM IST |
|
|
|
|
|
**Tasks:** |
|
|
- Fetch USD/LKR + indicators from yfinance |
|
|
- Train GRU model (MLflow tracked) |
|
|
- Generate next-day prediction |
|
|
- Save to JSON for API |
|
|
|
|
|
--- |
|
|
|
|
|
## 📁 Project Structure |
|
|
|
|
|
``` |
|
|
Roger-Ultimate/ |
|
|
├── src/ |
|
|
│ ├── graphs/ # LangGraph definitions |
|
|
│ │ ├── combinedAgentGraph.py # Mother graph |
|
|
│ │ ├── politicalAgentGraph.py |
|
|
│ │ ├── economicalAgentGraph.py |
|
|
│ │ ├── meteorologicalAgentGraph.py |
|
|
│ │ ├── socialAgentGraph.py |
|
|
│ │ ├── intelligenceAgentGraph.py |
|
|
│ │ ├── dataRetrievalAgentGraph.py |
|
|
│ │ └── vectorizationAgentGraph.py # 5-step with anomaly detection |
|
|
│ ├── nodes/ # Agent implementations |
|
|
│ ├── states/ # State definitions |
|
|
│ ├── llms/ # LLM configurations |
|
|
│ ├── storage/ # ChromaDB, SQLite, Neo4j stores |
|
|
│ ├── rag.py # RAG chatbot |
|
|
│ └── utils/ |
|
|
│ └── utils.py # Tools incl. FloodWatch |
|
|
├── scripts/ |
|
|
│ └── scrape_weather_data.py # Weather data scraper |
|
|
├── models/ |
|
|
│ ├── anomaly-detection/ # ML Anomaly Pipeline |
|
|
│ │ ├── src/ |
|
|
│ │ │ ├── components/ # Pipeline stages |
|
|
│ │ │ ├── entity/ # Config/Artifact classes |
|
|
│ │ │ ├── pipeline/ # Orchestrators |
|
|
│ │ │ └── utils/ # Vectorizer, metrics |
|
|
│ │ ├── dags/ # Airflow DAGs |
|
|
│ │ ├── data_schema/ # Validation schemas |
|
|
│ │ ├── output/ # Trained models |
|
|
│ │ └── models_cache/ # Downloaded BERT models |
|
|
│ ├── weather-prediction/ # Weather ML Pipeline |
|
|
│ │ ├── src/components/ # data_ingestion, model_trainer, predictor |
|
|
│ │ ├── dags/ # weather_prediction_dag.py (4 AM) |
|
|
│ │ ├── artifacts/ # Trained LSTM models (.h5) |
|
|
│ │ └── main.py # CLI entry point |
|
|
│ └── currency-volatility-prediction/ # Currency ML Pipeline |
|
|
│ ├── src/components/ # data_ingestion, model_trainer, predictor |
|
|
│ ├── dags/ # currency_prediction_dag.py (4 AM) |
|
|
│ ├── artifacts/ # Trained GRU model |
|
|
│ └── main.py # CLI entry point |
|
|
├── datasets/ |
|
|
│ └── weather/ # Scraped weather CSVs |
|
|
├── frontend/ |
|
|
│ └── app/ |
|
|
│ ├── components/ |
|
|
│ │ ├── dashboard/ |
|
|
│ │ │ ├── AnomalyDetection.tsx |
|
|
│ │ │ ├── WeatherPredictions.tsx |
|
|
│ │ │ ├── CurrencyPrediction.tsx |
|
|
│ │ │ ├── NationalThreatCard.tsx # Flood threat score |
|
|
│ │ │ ├── HistoricalIntel.tsx # 30-year climate |
|
|
│ │ │ └── ... |
|
|
│ │ ├── map/ |
|
|
│ │ │ ├── MapView.tsx |
|
|
│ │ │ └── SatelliteView.tsx # Windy.com embed |
|
|
│ │ ├── FloatingChatBox.tsx # RAG chat UI |
|
|
│ │ └── ... |
|
|
│ └── pages/ |
|
|
│ └── Index.tsx # 7 tabs incl. SATELLITE |
|
|
├── main.py # FastAPI backend |
|
|
├── start.sh # Startup script |
|
|
└── requirements.txt |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
## 🔐 Environment Variables |
|
|
|
|
|
```env |
|
|
# LLM |
|
|
GROQ_API_KEY=your_groq_key |
|
|
|
|
|
# Neo4j (Knowledge Graph) |
|
|
NEO4J_URI=neo4j+s://your-instance.databases.neo4j.io |
|
|
NEO4J_USERNAME=neo4j |
|
|
NEO4J_PASSWORD=your_password |
|
|
NEO4J_ENABLED=true |
|
|
NEO4J_DATABASE=neo4j |
|
|
|
|
|
# ChromaDB (Vector Store) |
|
|
CHROMADB_PATH=./data/chromadb |
|
|
CHROMADB_COLLECTION=Roger_feeds |
|
|
CHROMADB_SIMILARITY_THRESHOLD=0.85 |
|
|
|
|
|
# SQLite (Fast Cache) |
|
|
SQLITE_DB_PATH=./data/cache/feeds.db |
|
|
|
|
|
# MLflow (DagsHub) |
|
|
MLFLOW_TRACKING_URI=https://dagshub.com/... |
|
|
MLFLOW_TRACKING_USERNAME=... |
|
|
MLFLOW_TRACKING_PASSWORD=... |
|
|
|
|
|
# Pipeline |
|
|
BATCH_THRESHOLD=1000 |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
## 🧪 Testing Framework |
|
|
|
|
|
Industry-level testing infrastructure for the agentic AI system. |
|
|
|
|
|
### Test Structure |
|
|
|
|
|
``` |
|
|
tests/ |
|
|
├── conftest.py # Pytest fixtures and configuration |
|
|
├── unit/ # Unit tests for individual components |
|
|
│ └── test_utils.py |
|
|
├── integration/ # Multi-component integration tests |
|
|
│ └── test_agent_routing.py |
|
|
├── evaluation/ # LLM-as-Judge evaluation tests |
|
|
│ ├── agent_evaluator.py # Evaluation harness |
|
|
│ ├── adversarial_tests.py # Prompt injection & edge cases |
|
|
│ └── golden_datasets/ |
|
|
│ └── expected_responses.json |
|
|
└── e2e/ # End-to-end workflow tests |
|
|
└── test_full_pipeline.py |
|
|
``` |
|
|
|
|
|
### LangSmith Integration |
|
|
|
|
|
Automatic tracing for all agent decisions when `LANGSMITH_API_KEY` is set. |
|
|
|
|
|
```env |
|
|
# Add to .env |
|
|
LANGSMITH_API_KEY=your_langsmith_api_key |
|
|
LANGSMITH_PROJECT=roger-intelligence # Optional, defaults to 'roger-intelligence' |
|
|
``` |
|
|
|
|
|
**View traces:** [smith.langchain.com](https://smith.langchain.com/) |
|
|
|
|
|
### Running Tests |
|
|
|
|
|
```bash |
|
|
# Run all tests |
|
|
python run_tests.py |
|
|
|
|
|
# Run specific test suites |
|
|
python run_tests.py --unit # Unit tests only |
|
|
python run_tests.py --adversarial # Security/adversarial tests |
|
|
python run_tests.py --eval # LLM-as-Judge evaluation |
|
|
python run_tests.py --e2e # End-to-end tests |
|
|
|
|
|
# With coverage report |
|
|
python run_tests.py --coverage |
|
|
|
|
|
# Enable LangSmith tracing in tests |
|
|
python run_tests.py --with-langsmith |
|
|
``` |
|
|
|
|
|
### Agent Evaluation Harness |
|
|
|
|
|
The `agent_evaluator.py` implements the **LLM-as-Judge** pattern: |
|
|
|
|
|
| Metric | Description | |
|
|
|--------|-------------| |
|
|
| **Tool Selection Accuracy** | Did the agent use the correct tools? | |
|
|
| **Response Quality** | Is the response relevant and coherent? | |
|
|
| **BLEU Score** | N-gram text similarity (0-1, higher = better match) | |
|
|
| **Hallucination Detection** | Did the agent fabricate information? | |
|
|
| **Graceful Degradation** | Does it handle failures properly? | |
|
|
|
|
|
```bash |
|
|
# Run standalone evaluator |
|
|
python tests/evaluation/agent_evaluator.py |
|
|
``` |
|
|
|
|
|
### Adversarial Testing |
|
|
|
|
|
Tests for security and robustness: |
|
|
|
|
|
| Test Category | Description | |
|
|
|--------------|-------------| |
|
|
| **Prompt Injection** | Ignore instructions, jailbreak, context switching | |
|
|
| **Out-of-Domain** | Non-SL queries, illegal requests, impossible questions | |
|
|
| **Malformed Input** | Empty, XSS, SQL injection, unicode flood | |
|
|
| **Graceful Degradation** | API timeouts, empty responses, rate limiting | |
|
|
|
|
|
### CI/CD Pipeline |
|
|
|
|
|
GitHub Actions workflow (`.github/workflows/test.yml`): |
|
|
|
|
|
```yaml |
|
|
on: [push, pull_request] |
|
|
|
|
|
jobs: |
|
|
unit-tests: # Runs on every push |
|
|
adversarial-tests: # Security tests on every push |
|
|
evaluation-tests: # LLM evaluation on main branch only |
|
|
lint: # Code quality checks |
|
|
``` |
|
|
|
|
|
**Required Secrets:** |
|
|
- `LANGSMITH_API_KEY` - For evaluation test logging |
|
|
- `GROQ_API_KEY` - For LLM-based evaluation |
|
|
|
|
|
--- |
|
|
|
|
|
## 🐛 Troubleshooting |
|
|
|
|
|
### FastText won't install on Windows |
|
|
```bash |
|
|
# Use pre-built wheel instead |
|
|
pip install fasttext-wheel |
|
|
``` |
|
|
|
|
|
### BERT models downloading slowly |
|
|
```bash |
|
|
# Pre-download all models |
|
|
python models/anomaly-detection/download_models.py |
|
|
``` |
|
|
|
|
|
### Airflow not starting |
|
|
```bash |
|
|
# Ensure Docker is running |
|
|
docker info |
|
|
|
|
|
# Initialize Astro project |
|
|
cd models/anomaly-detection |
|
|
astro dev init |
|
|
astro dev start |
|
|
``` |
|
|
|
|
|
### NumPy 2.0 / ChromaDB compatibility error |
|
|
```bash |
|
|
# If you see "A module that was compiled using NumPy 1.x cannot be run in NumPy 2.x" |
|
|
pip install "numpy<2.0" |
|
|
|
|
|
# Or upgrade chromadb to latest |
|
|
pip install --upgrade chromadb |
|
|
``` |
|
|
|
|
|
### Keras model loading error ("Could not locate function 'mse'") |
|
|
```bash |
|
|
# If currency/weather models fail to load with Keras 3.x |
|
|
# Retrain the model - it will save in .keras format automatically |
|
|
cd models/currency-volatility-prediction |
|
|
python main.py --mode train |
|
|
|
|
|
# Or for weather |
|
|
cd models/weather-prediction |
|
|
python main.py --mode train |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
## 📄 License |
|
|
|
|
|
MIT License - Built for Production |
|
|
|
|
|
--- |
|
|
|
|
|
## 🙏 Acknowledgments |
|
|
|
|
|
- **Groq** - High-speed LLM inference |
|
|
- **LangGraph** - Agent orchestration |
|
|
- **HuggingFace** - SinhalaBERTo, Tamil-BERT, DistilBERT |
|
|
- **Optuna** - Hyperparameter optimization |
|
|
- **MLflow** - Experiment tracking |
|
|
- Sri Lankan government for open data sources |
|
|
|
|
|
|