title: Roger Intelligence Platform
emoji: ⚡
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
🇱🇰 Roger Intelligence Platform
Real-Time Situational Awareness for Sri Lanka
A multi-agent AI system that aggregates intelligence from 50+ data sources to provide risk analysis and opportunity detection for businesses operating in Sri Lanka.
🌐 Live Demo
| Component | URL |
|---|---|
| Frontend Dashboard | https://model-x-frontend-snowy.vercel.app/ |
| Backend API | https://nivakaran-Roger.hf.space |
🎯 Key Features
✅ 5 Domain Agents + 2 Orchestrators running in parallel:
- Social Agent - Reddit, Twitter, Facebook, Threads, BlueSky monitoring
- Political Agent - Gazette, Parliament, District Social Media
- Economical Agent - CSE Stock Market + Technical Indicators (SMA, EMA, RSI, MACD)
- Meteorological Agent - DMC Weather + RiverNet + FloodWatch Integration
- Intelligence Agent - Brand Monitoring + Threat Detection + User-Configurable Targets
- Combined Agent (Orchestrator) - Fan-out/Fan-in coordination, LLM filtering, feed ranking
- Data Retrieval Agent - Web scraping orchestration with anti-bot features
✅ Situational Awareness Dashboard:
- CEB Power Status - Load shedding / power outage monitoring
- Fuel Prices - Petrol 92/95, Diesel, Kerosene (CEYPETCO)
- CBSL Economic Indicators - Inflation, policy rates, forex reserves, USD/LKR
- Health Alerts - Dengue case tracking, disease outbreak monitoring
- Commodity Prices - 15 essential goods (rice, sugar, gas, eggs, etc.)
- Water Supply Status - NWSDB disruption alerts
✅ ML Anomaly Detection Pipeline (Integrated into Graph):
- Language-specific BERT models (Sinhala, Tamil, English)
- Real-time anomaly inference on every graph cycle
- Clustering (DBSCAN, KMeans, HDBSCAN)
- Anomaly Detection (Isolation Forest, LOF)
- MLflow + DagsHub tracking
✅ Weather Prediction ML Pipeline:
- LSTM Neural Network (30-day sequences)
- Predicts: Temperature, Rainfall, Flood Risk, Severity
- 21 weather stations → 25 districts
- Airflow DAG runs daily at 4 AM
✅ Currency Prediction ML Pipeline:
- GRU Neural Network (optimized for 8GB RAM)
- Predicts: USD/LKR exchange rate
- Features: Technical indicators + CSE + Gold + Oil + USD Index
- MLflow tracking + Airflow DAG at 4 AM
✅ Stock Price Prediction ML Pipeline:
- Multi-Architecture: LSTM, GRU, BiLSTM, BiGRU
- Optuna hyperparameter tuning (30 trials per stock)
- Per-stock best model selection
- 10 top CSE stocks (JKH, COMB, DIAL, HNB, etc.)
✅ RAG-Powered Chatbot:
- Chat-history aware Q&A
- Queries all ChromaDB intelligence collections
- Domain filtering (political, economic, weather, social)
- Floating chat UI in dashboard
✅ Trending/Velocity Detection:
- SQLite-based topic frequency tracking (24-hour rolling window)
- Momentum calculation: current_hour / avg_last_6_hours
- Spike alerts when topic volume > 3x baseline
- Integrated into Combined Agent dashboard
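The momentum rule above can be sketched in a few lines of Python. This is a minimal illustration, not the project's tracker: the function names and the zero-baseline fallback are assumptions, and only the ratio and the 3x threshold come from the list above.

```python
from statistics import mean

SPIKE_THRESHOLD = 3.0  # "3x baseline" from the feature list


def momentum(current_hour: int, last_6_hours: list[int]) -> float:
    """Return current_hour / avg_last_6_hours."""
    baseline = mean(last_6_hours) if last_6_hours else 0
    if baseline == 0:
        # Assumption: with no baseline yet, fall back to the raw mention count.
        return float(current_hour)
    return current_hour / baseline


def is_spike(current_hour: int, last_6_hours: list[int]) -> bool:
    """A topic spikes when its momentum is strictly above the threshold."""
    return momentum(current_hour, last_6_hours) > SPIKE_THRESHOLD
```

For example, 12 mentions in the current hour against an average of 3 per hour gives a momentum of 4.0, which would raise a spike alert.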
✅ Real-Time Dashboard with:
- Live Intelligence Feed
- Floating AI Chatbox
- Weather Predictions Tab
- Live Satellite/Weather Map (Windy.com)
- National Flood Threat Score
- 30-Year Historical Climate Analysis
- Trending Topics & Spike Alerts
- Enhanced Operational Indicators (infrastructure_health, regulatory_activity, investment_climate)
- Operational Risk Radar
- ML Anomaly Detection Display
- Market Predictions with Moving Averages
- Risk & Opportunity Classification
✅ Weather Data Scraper for ML Training:
- Open-Meteo API (free historical data)
- NASA FIRMS (fire/heat detection)
- All 25 districts coverage
- Year-wise CSV export for model training
✅ Operational Dashboard Metrics:
- Logistics Friction: Average confidence of mobility/social domain risk events
- Compliance Volatility: Average confidence of political domain risks
- Market Instability: Average confidence of market/economical domain risks
- Opportunity Index: Average confidence of opportunity-classified events
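Each of these metrics is an average of event confidences over a domain filter. A rough sketch follows, assuming a hypothetical event shape with `domain`, `impact_type`, and `confidence` keys; the real field names in the backend may differ.

```python
from statistics import mean

# Hypothetical event shape:
# {"domain": str, "impact_type": "risk" | "opportunity", "confidence": float}
METRIC_DOMAINS = {
    "logistics_friction": {"mobility", "social"},
    "compliance_volatility": {"political"},
    "market_instability": {"market", "economical"},
}


def average_confidence(events):
    """Mean confidence of the given events, or 0.0 when there are none."""
    values = [e["confidence"] for e in events]
    return round(mean(values), 3) if values else 0.0


def dashboard_metrics(events):
    """Compute the four operational metrics from a list of classified events."""
    risks = [e for e in events if e["impact_type"] == "risk"]
    metrics = {
        name: average_confidence([e for e in risks if e["domain"] in domains])
        for name, domains in METRIC_DOMAINS.items()
    }
    metrics["opportunity_index"] = average_confidence(
        [e for e in events if e["impact_type"] == "opportunity"]
    )
    return metrics
```

Returning 0.0 for an empty domain is a design choice made here so the dashboard always has a value to render.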
✅ Multi-District Province-Aware Event Categorization:
- Events mentioning provinces are displayed in all constituent districts
- Supports: Western, Southern, Central, Northern, Eastern, Sabaragamuwa, Uva, North Western, North Central provinces
- Both frontend (MapView, DistrictInfoPanel) and backend are synchronized
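One way to picture the province expansion is a lookup table plus a text match. The sketch below is an assumption about how such a mapping could look, not the code in MapView or the backend; the function name and the requirement that the word "province" follows the name are illustrative choices.

```python
import re

PROVINCE_DISTRICTS = {
    "Western": ["Colombo", "Gampaha", "Kalutara"],
    "Central": ["Kandy", "Matale", "Nuwara Eliya"],
    "Southern": ["Galle", "Matara", "Hambantota"],
    "Northern": ["Jaffna", "Kilinochchi", "Mannar", "Vavuniya", "Mullaitivu"],
    "Eastern": ["Batticaloa", "Ampara", "Trincomalee"],
    "North Western": ["Kurunegala", "Puttalam"],
    "North Central": ["Anuradhapura", "Polonnaruwa"],
    "Uva": ["Badulla", "Monaragala"],
    "Sabaragamuwa": ["Ratnapura", "Kegalle"],
}

# Longer names first so "North Western Province" is not read as "Western Province".
_NAMES = sorted(PROVINCE_DISTRICTS, key=len, reverse=True)
_PROVINCE_RE = re.compile(
    r"\b(" + "|".join(re.escape(n) for n in _NAMES) + r")\s+province\b",
    re.IGNORECASE,
)
_CANONICAL = {name.lower(): name for name in PROVINCE_DISTRICTS}


def districts_for_event(text: str) -> set[str]:
    """Return every district belonging to a province mentioned in the text."""
    districts: set[str] = set()
    for match in _PROVINCE_RE.finditer(text):
        districts.update(PROVINCE_DISTRICTS[_CANONICAL[match.group(1).lower()]])
    return districts
```

An event mentioning "Western Province" would then be shown under Colombo, Gampaha, and Kalutara.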
✅ 3-Tier Storage Architecture with Deduplication:
- Tier 1: SQLite - Fast hash-based exact match (microseconds)
- Tier 2: ChromaDB - Semantic similarity search with sentence transformers (milliseconds)
- Tier 3: Neo4j Aura - Knowledge graph for event relationships and entity tracking
- Unified StorageManager orchestrates all backends
- Deduplication prevents duplicate feeds across all domain agents
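The first two tiers can be illustrated with the standard library alone. This is a simplified sketch, not the project's StorageManager: `TinyDeduplicator` is a hypothetical class, the embeddings are plain lists standing in for sentence-transformer vectors, and an in-memory list replaces ChromaDB. The 0.85 threshold mirrors the similarity value quoted in this README.

```python
import hashlib
import math
import sqlite3

SIMILARITY_THRESHOLD = 0.85


def content_hash(text: str) -> str:
    """Hash of the text after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyDeduplicator:
    """Hypothetical two-tier check: exact hash first, then semantic similarity."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE seen (hash TEXT PRIMARY KEY)")
        self.vectors = []  # stand-in for a vector store

    def is_duplicate(self, text, embedding):
        digest = content_hash(text)
        # Tier 1: exact match on the content hash.
        if self.db.execute("SELECT 1 FROM seen WHERE hash = ?", (digest,)).fetchone():
            return True
        # Tier 2: near-duplicate by cosine similarity.
        if any(cosine_similarity(embedding, v) >= SIMILARITY_THRESHOLD for v in self.vectors):
            return True
        self.db.execute("INSERT INTO seen (hash) VALUES (?)", (digest,))
        self.vectors.append(embedding)
        return False
```

Checking the cheap hash before the similarity search keeps the common case (an exact repost) fast.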
🏗️ System Architecture
┌─────────────────────────────────────────────────────────────────────────┐
│ Roger Combined Graph │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ Graph Initiator (Reset) │ │
│ └────────────────────────────────────────────────────────────────┘ │
│ │ Fan-Out │
│ ┌────────────┬────────────┼────────────┬────────────┬────────────┐ │
│ ▼ ▼ ▼ ▼ ▼ ▼ │
│ ┌──────┐ ┌──────┐ ┌──────────┐ ┌──────┐ ┌──────────┐ ┌────┐│
│ │Social│ │Econ │ │Political │ │Meteo │ │Intellig- │ │Data││
│ │Agent │ │Agent │ │Agent │ │Agent │ │ence Agent│ │Retr││
│ └──────┘ └──────┘ └──────────┘ └──────┘ └──────────┘ └────┘│
│ │ │ │ │ │ │ │
│ └────────────┴────────────┴────────────┴────────────┴────────────┘ │
│ │ Fan-In │
│ ┌─────────▼──────────┐ │
│ │ Feed Aggregator │ │
│ │ (Rank & Dedupe) │ │
│ └─────────┬──────────┘ │
│ ┌─────────▼──────────┐ │
│ │ Vectorization │ │
│ │ Agent (Optional) │ │
│ └─────────┬──────────┘ │
│ ┌─────────▼──────────┐ │
│ │ Router (Loop/End) │ │
│ └────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
📊 Graph Implementations
1. Combined Agent Graph (combinedAgentGraph.py)
The Mother Graph - Orchestrates all domain agents in parallel.
graph TD
A[Graph Initiator] -->|Fan-Out| B[Social Agent]
A -->|Fan-Out| C[Economic Agent]
A -->|Fan-Out| D[Political Agent]
A -->|Fan-Out| E[Meteorological Agent]
A -->|Fan-Out| F[Intelligence Agent]
A -->|Fan-Out| G[Data Retrieval Agent]
B -->|Fan-In| H[Feed Aggregator]
C --> H
D --> H
E --> H
F --> H
G --> H
H --> I[Data Refresher]
I --> J{Router}
J -->|Loop| A
J -->|End| K[END]
Key Features:
- Custom state reducers for parallel execution
- Feed deduplication with content hashing
- Loop control with configurable intervals
- Real-time WebSocket broadcasting
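A state reducer for parallel branches is just a function that merges two values of the same state key. The sketch below shows one possible reducer that deduplicates feeds by content hash; `merge_feeds` and the `content` key are assumptions for illustration. In LangGraph, a function like this is typically attached to a state field through `typing.Annotated`.

```python
import hashlib


def feed_key(item: dict) -> str:
    """Content hash used to recognise the same feed item from two agents."""
    text = item["content"].strip().lower()
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def merge_feeds(existing: list[dict], incoming: list[dict]) -> list[dict]:
    """Reducer: combine two feed lists, keeping the first copy of each item."""
    seen = set()
    merged = []
    for item in existing + incoming:
        key = feed_key(item)
        if key not in seen:
            seen.add(key)
            merged.append(item)
    return merged
```

Because the function only depends on its two arguments, the order in which parallel agents finish does not change which items survive, only their position in the list.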
Architecture Improvements (v2.1):
- Rate Limiting: Domain-specific rate limits prevent anti-bot detection
- Twitter: 15 RPM, LinkedIn: 10 RPM, News: 60 RPM
- Thread-safe semaphores for max concurrent requests
- Error Handling: Per-agent try/catch prevents cascading failures
- Failed agents return empty results, others continue
- Non-Blocking Refresh: 60-second cycle with interruptible sleep
- threading.Event.wait() instead of blocking time.sleep()
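Two of these ideas, per-agent error isolation and the interruptible refresh wait, can be sketched together. The helper names below are hypothetical and the agents are plain callables; rate limiting is left out for brevity.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_agent_safely(name, agent):
    """Run one agent; a failure yields an empty result instead of propagating."""
    try:
        return name, agent()
    except Exception as exc:
        print(f"[{name}] failed: {exc}")
        return name, []


def run_cycle(agents):
    """Fan out to all agents in parallel and collect their results."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        futures = [pool.submit(run_agent_safely, n, a) for n, a in agents.items()]
        return dict(f.result() for f in futures)


def refresh_loop(agents, stop_event, interval=60.0, on_result=print):
    """Repeat cycles until stop_event is set."""
    while not stop_event.is_set():
        on_result(run_cycle(agents))
        # Unlike time.sleep(), wait() returns as soon as stop_event.set() is called.
        stop_event.wait(timeout=interval)
```

Calling `stop_event.set()` from another thread ends the loop without waiting out the remaining part of the 60-second interval.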
Storage Data Flow
┌─────────────────────────────────────────────────────────────────────────────┐
│ DOMAIN AGENTS (Parallel) │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────────┐ │
│ │ Social │ │Political │ │Economic │ │ Meteo │ │ Intelligence │ │
│ │ Agent │ │ Agent │ │ Agent │ │ Agent │ │ Agent │ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ └──────┬───────┘ │
│ └────────────┴────────────┴────────────┴──────────────┘ │
│ │ Fan-In │
│ ┌────────────▼─────────────┐ │
│ │ CombinedAgentNode │ │
│ │ (LLM Filter + Rank) │ │
│ └────────────┬─────────────┘ │
└─────────────────────────────────┼───────────────────────────────────────────┘
│
┌─────────────▼──────────────┐
│ StorageManager │
│ (3-Tier Deduplication) │
└─────────────┬──────────────┘
┌───────────────────────┼──────────────────────────┐
│ │ │
▼ ▼ ▼
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────────────┐
│ SQLite │ │ ChromaDB │ │ Neo4j Aura │
│ (Fast Cache) │ │ (Vector Store) │ │ (Knowledge Graph) │
│ ───────────── │ │ ────────────── │ │ ─────────────────── │
│ Hash-based │ │ Semantic search │ │ Event relationships │
│ Exact match │ │ Similarity 0.85 │ │ Domain nodes │
│ ~microseconds │ │ ~milliseconds │ │ Entity tracking │
└─────────────────┘ └──────────────────┘ └─────────────────────────┘
2. Political Agent Graph (politicalAgentGraph.py)
3-Module Hybrid Architecture
| Module | Description | Sources |
|---|---|---|
| Official Sources | Government data | Gazette, Parliament Minutes |
| Social Media | Political sentiment | Twitter, Facebook, Reddit (National + 25 Districts) |
| Feed Generation | LLM Processing | Categorize → Summarize → Format |
┌─────────────────────────────────────────────┐
│ Module 1: Official │ Module 2: Social │
│ ┌─────────────────┐ │ ┌───────────────┐ │
│ │ Gazette │ │ │ National │ │
│ │ Parliament │ │ │ Districts (25)│ │
│ └─────────────────┘ │ │ World Politics│ │
│ │ └───────────────┘ │
└────────────┬───────────┴────────┬──────────┘
│ Fan-In │
▼ ▼
┌────────────────────────────┐
│ Module 3: Feed Generation │
│ Categorize → LLM → Format │
└────────────────────────────┘
3. Economic Agent Graph (economicalAgentGraph.py)
Market Intelligence & Technical Analysis
| Component | Description |
|---|---|
| Stock Collector | CSE market data (200+ stocks) |
| Technical Analyzer | SMA, EMA, RSI, MACD |
| Trend Detector | Bullish/Bearish signals |
| Feed Generator | Risk/Opportunity classification |
Indicators Calculated:
- Simple Moving Average (SMA-20, SMA-50)
- Exponential Moving Average (EMA-12, EMA-26)
- Relative Strength Index (RSI)
- MACD with Signal Line
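For readers unfamiliar with these indicators, here is a dependency-free sketch. It is not the Technical Analyzer's code: the 9-period signal line and 14-period RSI are conventional defaults assumed here, and the RSI uses simple averages rather than Wilder's smoothing.

```python
def sma(prices, window):
    """Simple moving average of the last `window` prices."""
    if len(prices) < window:
        raise ValueError("not enough prices for the requested window")
    return sum(prices[-window:]) / window


def ema_series(prices, span):
    """EMA series with alpha = 2 / (span + 1), seeded with the first price."""
    alpha = 2 / (span + 1)
    values = [prices[0]]
    for price in prices[1:]:
        values.append(alpha * price + (1 - alpha) * values[-1])
    return values


def macd(prices, fast=12, slow=26, signal=9):
    """Return the latest (MACD line, signal line) values."""
    macd_line = [f - s for f, s in zip(ema_series(prices, fast), ema_series(prices, slow))]
    return macd_line[-1], ema_series(macd_line, signal)[-1]


def rsi(prices, period=14):
    """RSI from simple average gain and loss over the last `period` changes."""
    if len(prices) < period + 1:
        raise ValueError("need period + 1 prices")
    recent = prices[-(period + 1):]
    changes = [b - a for a, b in zip(recent, recent[1:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0
    return 100 - 100 / (1 + avg_gain / avg_loss)
```

A common reading is that a MACD line above its signal line suggests bullish momentum, while an RSI near 100 suggests an overbought stock.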
4. Meteorological Agent Graph (meteorologicalAgentGraph.py)
Weather & Disaster Monitoring + FloodWatch Integration
┌─────────────────────────────────────┐
│ DMC Weather Collector │
│ (Daily forecasts, 25 districts) │
└─────────────┬───────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ RiverNet Data Collector │
│ (River levels, flood monitoring) │
└─────────────┬───────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ FloodWatch Historical Data │
│ (30-year climate analysis) │
└─────────────┬───────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ National Threat Calculator │
│ (Aggregated flood risk 0-100) │
└─────────────┬───────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ Alert Generator │
│ (Severity classification) │
└─────────────────────────────────────┘
Alert Levels:
- 🟢 Normal: Standard conditions
- 🟡 Advisory: Watch for developments
- 🟠 Warning: Take precautions
- 🔴 Critical: Immediate action required
FloodWatch Features:
| Feature | Description |
|---|---|
| Historical Analysis | 30-year climate data (1995-2025) |
| Decadal Comparison | 3 periods: 1995-2004, 2005-2014, 2015-2025 |
| National Threat Score | 0-100 aggregated risk from rivers + alerts + season |
| High-Risk Periods | May-Jun (SW Monsoon), Oct-Nov (NE Monsoon) |
5. Social Agent Graph (socialAgentGraph.py)
Multi-Platform Social Media Monitoring
| Platform | Data Source | Coverage |
|---|---|---|
| Reddit | PRAW API | r/srilanka, r/colombo |
| Twitter/X | Nitter scraping | #SriLanka, #Colombo |
| Facebook | Profile scraping | News pages |
| Threads | Meta API | Trending topics |
| BlueSky | AT Protocol | Political discourse |
6. Intelligence Agent Graph (intelligenceAgentGraph.py)
Brand & Threat Monitoring + User-Configurable Targets
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Brand Monitor │ │ Threat Scanner │ │ User Targets │
│ - Company news │ │ - Security │ │ - Custom keys │
│ - Competitor │ │ - Compliance │ │ - User profiles │
│ - Market share │ │ - Geopolitical │ │ - Products │
└────────┬────────┘ └────────┬────────┘ └────────┬────────┘
│ │ │
└──────────────────────┼──────────────────────┘
▼
┌─────────────────────┐
│ Intelligence Report │
│ (Priority ranked) │
└─────────────────────┘
User-Configurable Monitoring: Users can define custom monitoring targets via the frontend settings panel or API:
| Config Type | Description | Example |
|---|---|---|
| Keywords | Custom search terms | "Colombo Port", "BOI Investment" |
| Products | Products to track | "iPhone 15", "Samsung Galaxy" |
| Profiles | Social media accounts | @CompetitorX (Twitter), CompanyY (Facebook) |
API Endpoints:
# Get current config
GET /api/intel/config
# Update full config
POST /api/intel/config
Body: {"user_keywords": ["keyword1"], "user_profiles": {"twitter": ["@account"]}, "user_products": ["Product"]}
# Add single target
POST /api/intel/config/add?target_type=keyword&value=Colombo+Port
# Remove target
DELETE /api/intel/config/remove?target_type=profile&value=CompetitorX&platform=twitter
Config File: src/config/intel_config.json
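The endpoints above can be called from Python with the standard library. The sketch below only builds the requests; the helper names are hypothetical and `BASE_URL` assumes a backend running locally on port 8000.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumption: local backend


def add_target_request(target_type, value):
    """POST /api/intel/config/add with query parameters."""
    query = urlencode({"target_type": target_type, "value": value})
    return Request(f"{BASE_URL}/api/intel/config/add?{query}", method="POST")


def remove_target_request(target_type, value, platform=None):
    """DELETE /api/intel/config/remove; platform is only sent when given."""
    params = {"target_type": target_type, "value": value}
    if platform:
        params["platform"] = platform
    return Request(f"{BASE_URL}/api/intel/config/remove?{urlencode(params)}", method="DELETE")


def update_config_request(keywords, profiles, products):
    """POST /api/intel/config with the full JSON body."""
    body = json.dumps(
        {"user_keywords": keywords, "user_profiles": profiles, "user_products": products}
    ).encode("utf-8")
    return Request(
        f"{BASE_URL}/api/intel/config",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Each returned object can be passed to `urllib.request.urlopen` to send it.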
7. Data Retrieval Agent Graph (dataRetrievalAgentGraph.py)
Web Scraping Orchestrator
Scraping Tools Available:
- scrape_news_site - Generic news scraper
- scrape_cse_live - CSE stock prices
- scrape_official_data - Government portals
- scrape_social_media - Multi-platform
Anti-Bot Features:
- Random delays (1-3s)
- User-agent rotation
- Retry with exponential backoff
- Headless browser fallback
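The first three features combine naturally into one retry wrapper. The sketch below is illustrative only: `fetch_with_retries` is a hypothetical helper, `fetch` is any callable taking a URL and headers, and the user-agent strings are examples. The headless browser fallback is not shown.

```python
import random
import time

# Example user-agent strings; a real deployment would maintain its own list.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]


def fetch_with_retries(fetch, url, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url, headers) with a rotated User-Agent and exponential backoff."""
    for attempt in range(max_attempts):
        headers = {"User-Agent": random.choice(USER_AGENTS)}
        sleep(random.uniform(1.0, 3.0))  # random 1-3 s delay before each request
        try:
            return fetch(url, headers)
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # back off 1 s, 2 s, 4 s, ...
```

Passing `sleep` as a parameter makes the wrapper easy to test without real waiting.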
8. Vectorization Agent Graph (vectorizationAgentGraph.py)
6-Step Multilingual NLP Pipeline with Anomaly + Trending Detection
┌─────────────────────────────────────────────────┐
│ Step 1: Language Detection │
│ FastText + Unicode script analysis │
│ Supports: English, Sinhala (සිංහල), Tamil (தமிழ்)│
└─────────────────┬───────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ Step 2: Text Vectorization │
│ ┌─────────────┬─────────────┬─────────────────┐ │
│ │ DistilBERT │ SinhalaBERTo│ Tamil-BERT │ │
│ │ (English) │ (Sinhala) │ (Tamil) │ │
│ └─────────────┴─────────────┴─────────────────┘ │
│ Output: 768-dim vector per text │
└─────────────────┬───────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ Step 3: Anomaly Detection (Isolation Forest) │
│ - English: ML model inference │
│ - Sinhala/Tamil: Skipped (incompatible vectors) │
│ - Outputs anomaly_score (0-1) │
└─────────────────┬───────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ Step 4: Trending Detection │
│ - Entity extraction (hashtags, proper nouns) │
│ - Momentum: current_hour / avg_last_6_hours │
│ - Spike alerts when momentum > 3x │
└─────────────────┬───────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ Step 5: Expert Summary (GroqLLM) │
│ - Opportunity & threat identification │
│ - Sentiment analysis │
└─────────────────┬───────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ Step 6: Format Output │
│ - Includes anomaly + trending in domain_insights│
└─────────────────────────────────────────────────┘
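The Unicode script analysis in Step 1 can be approximated by counting characters per script block. This sketch covers only that part (FastText is not involved), and the function name is an assumption. The ranges used are the Unicode Sinhala block (U+0D80 to U+0DFF) and Tamil block (U+0B80 to U+0BFF).

```python
def detect_script_language(text: str) -> str:
    """Guess the language from the dominant script in the text."""
    counts = {"sinhala": 0, "tamil": 0, "english": 0}
    for ch in text:
        code = ord(ch)
        if 0x0D80 <= code <= 0x0DFF:
            counts["sinhala"] += 1
        elif 0x0B80 <= code <= 0x0BFF:
            counts["tamil"] += 1
        elif ch.isascii() and ch.isalpha():
            counts["english"] += 1
    best = max(counts, key=counts.get)
    # No letters from any known script: report "unknown" rather than guess.
    return best if counts[best] > 0 else "unknown"
```

The detected language would then select which BERT model vectorizes the text in Step 2.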
Trending Detection API Endpoints:
| Endpoint | Method | Description |
|---|---|---|
| /api/trending | GET | Get trending topics & spike alerts |
| /api/trending/topic/{topic} | GET | Get hourly history for a topic |
| /api/trending/record | POST | Record a topic mention (testing) |
10. Weather Prediction Pipeline (models/weather-prediction/)
LSTM-Based Multi-District Weather Forecasting
┌─────────────────────────────────────────────────┐
│ Data Source: Tutiempo.net (21 stations) │
│ Historical data since 1944 │
└─────────────────┬───────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ LSTM Neural Network │
│ ┌─────────────────────────────────────────────┐ │
│ │ Input: 30-day sequence (11 features) │ │
│ │ Layer 1: LSTM(64) + BatchNorm + Dropout │ │
│ │ Layer 2: LSTM(32) + BatchNorm + Dropout │ │
│ │ Output: Dense(3) → temp_max, temp_min, rain │ │
│ └─────────────────────────────────────────────┘ │
└─────────────────┬───────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ Severity Classifier │
│ - Combines temp, rainfall, flood risk │
│ - Outputs: normal/advisory/warning/critical │
└─────────────────┬───────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ Output: 25 District Predictions │
│ - Temperature (high/low °C) │
│ - Rainfall (mm + probability) │
│ - Flood risk (integrated with RiverNet) │
└─────────────────────────────────────────────────┘
Usage:
# Run full pipeline
cd models/weather-prediction
python main.py --mode full
# Just predictions
python main.py --mode predict
# Train specific station
python main.py --mode train --station COLOMBO
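The 30-day input sequences come from sliding a window over the daily records. A minimal sketch of that windowing step is shown below, assuming each row holds the 11 features and each target holds the three outputs (temp_max, temp_min, rain); `make_sequences` is a hypothetical name, not a function from the pipeline.

```python
def make_sequences(rows, targets, window=30):
    """Pair each `window`-day slice of feature rows with the next day's target."""
    X, y = [], []
    for end in range(window, len(rows)):
        X.append(rows[end - window:end])  # days end-window .. end-1
        y.append(targets[end])            # the day right after the window
    return X, y
```

With 35 days of data this yields 5 training samples, each of shape 30 x 11.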
11. Currency Prediction Pipeline (models/currency-volatility-prediction/)
GRU-Based USD/LKR Exchange Rate Forecasting
┌─────────────────────────────────────────────────┐
│ Data Sources (yfinance) │
│ - USD/LKR exchange rate │
│ - CSE stock index (correlation) │
│ - Gold, Oil prices (global factors) │
│ - USD strength index │
└─────────────────┬───────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ Feature Engineering (25+ features) │
│ - SMA, EMA, RSI, MACD, Bollinger Bands │
│ - Volatility, Momentum indicators │
│ - Temporal encoding (day/month cycles) │
└─────────────────┬───────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ GRU Neural Network (8GB RAM optimized) │
│ ┌─────────────────────────────────────────────┐ │
│ │ Input: 30-day sequence │ │
│ │ Layer 1: GRU(64) + BatchNorm + Dropout │ │
│ │ Layer 2: GRU(32) + BatchNorm + Dropout │ │
│ │ Output: Dense(1) → next_day_rate │ │
│ └─────────────────────────────────────────────┘ │
└─────────────────┬───────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ Output: USD/LKR Prediction │
│ - Current & predicted rate │
│ - Change % and direction │
│ - Volatility classification (low/medium/high) │
└─────────────────────────────────────────────────┘
Usage:
# Run full pipeline
cd models/currency-volatility-prediction
python main.py --mode full
# Just predict
python main.py --mode predict
# Train GRU model
python main.py --mode train --epochs 100
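Temporal encoding of day and month cycles is commonly done with sine and cosine pairs, so that the end of a cycle sits next to its start. The sketch below shows that technique under the assumption that the pipeline encodes day of week and month this way; the function and key names are illustrative.

```python
import math
from datetime import date


def cyclical(value, period):
    """Map a position in a cycle to a point on the unit circle."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def temporal_features(day: date) -> dict:
    """Sine/cosine features for day of week and month of year."""
    dow_sin, dow_cos = cyclical(day.weekday(), 7)
    month_sin, month_cos = cyclical(day.month - 1, 12)
    return {
        "dow_sin": dow_sin,
        "dow_cos": dow_cos,
        "month_sin": month_sin,
        "month_cos": month_cos,
    }
```

Under this encoding December and January end up close together, which a raw month number (12 versus 1) would not capture.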
12. RAG Chatbot (src/rag.py)
Chat-History Aware Intelligence Q&A
┌─────────────────────────────────────────────────┐
│ MultiCollectionRetriever │
│ - Connects to ChromaDB intelligence collection │
│ - Roger_feeds (all agent domain feeds) │
└─────────────────┬───────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ Question Reformulation (History-Aware) │
│ - Uses last 3-5 exchanges for context │
│ - Reformulates follow-up questions │
└─────────────────┬───────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ Groq LLM (llama-3.1-70b-versatile) │
│ - RAG with source citations │
│ - Domain-specific analysis │
└─────────────────────────────────────────────────┘
Usage:
# CLI mode
python src/rag.py
# Or via API
curl -X POST http://localhost:8000/api/rag/chat \
-H "Content-Type: application/json" \
-d '{"message": "What are the latest political events?"}'
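Keeping only the last few exchanges for question reformulation can be done with a bounded queue. The sketch below assumes a window of five exchanges (the upper end of the 3-5 range mentioned above); `ChatHistory` is a hypothetical helper, not a class from src/rag.py.

```python
from collections import deque


class ChatHistory:
    """Keeps the most recent question/answer pairs for follow-up context."""

    def __init__(self, max_exchanges=5):
        # deque drops the oldest exchange automatically once the limit is hit.
        self._exchanges = deque(maxlen=max_exchanges)

    def add(self, question, answer):
        self._exchanges.append((question, answer))

    def clear(self):
        self._exchanges.clear()

    def as_prompt_context(self):
        """Render the retained exchanges as plain text for the LLM prompt."""
        return "\n".join(f"User: {q}\nAssistant: {a}" for q, a in self._exchanges)
```

The rendered context would be prepended to a follow-up such as "What about last week?" so the model can rewrite it as a standalone question.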
🤖 ML Anomaly Detection Pipeline
Located in models/anomaly-detection/
Pipeline Components
| Component | File | Description |
|---|---|---|
| Data Ingestion | data_ingestion.py | SQLite + CSV fetching |
| Data Validation | data_validation.py | Schema-based validation |
| Data Transformation | data_transformation.py | Language detection + BERT vectorization |
| Model Trainer | model_trainer.py | Optuna + MLflow training |
Clustering and Anomaly Detection Models
| Model | Type | Use Case |
|---|---|---|
| DBSCAN | Density-based | Noise-robust clustering |
| KMeans | Centroid-based | Fast, fixed k clusters |
| HDBSCAN | Hierarchical density | Variable density clusters |
| Isolation Forest | Anomaly detection | Outlier identification |
| LOF | Local outlier | Density-based anomalies |
Training with Optuna
import optuna

# Hyperparameter optimization; `objective` is the trial-scoring function (not shown here)
study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
MLflow Tracking
import mlflow
import mlflow.sklearn

mlflow.set_tracking_uri("https://dagshub.com/...")
mlflow.log_params(best_params)
mlflow.log_metrics(metrics)
mlflow.sklearn.log_model(model, "model")
🌧️ Weather Data Scraper (scripts/scrape_weather_data.py)
Historical weather data collection for ML model training
Data Sources
| Source | API Key? | Data Available |
|---|---|---|
| Open-Meteo | ❌ Free | Historical weather since 1940 |
| NASA FIRMS | ✅ Optional | Fire/heat spot detection |
Collected Weather Variables
- temperature_2m_max/min/mean
- precipitation_sum, rain_sum
- precipitation_hours
- wind_speed_10m_max, wind_gusts_10m_max
- wind_direction_10m_dominant
Usage
# Scrape last 30 days (default)
python scripts/scrape_weather_data.py
# Scrape specific date range
python scripts/scrape_weather_data.py --start 2020-01-01 --end 2024-12-31
# Scrape multiple years for training dataset
python scripts/scrape_weather_data.py --years 2020,2021,2022,2023,2024
# Include fire detection data
python scripts/scrape_weather_data.py --years 2023,2024 --fires
# Hourly resolution (default is daily)
python scripts/scrape_weather_data.py --start 2024-01-01 --end 2024-01-31 --resolution hourly
Output
datasets/weather/
├── weather_daily_2020-01-01_2020-12-31.csv
├── weather_daily_2021-01-01_2021-12-31.csv
├── weather_combined.csv (merged file)
└── fire_detections_20241207.csv
Coverage
All 25 Sri Lankan districts with coordinates:
- Colombo, Gampaha, Kalutara, Kandy, Matale, Nuwara Eliya
- Galle, Matara, Hambantota, Jaffna, Kilinochchi, Mannar
- Vavuniya, Mullaitivu, Batticaloa, Ampara, Trincomalee
- Kurunegala, Puttalam, Anuradhapura, Polonnaruwa
- Badulla, Monaragala, Ratnapura, Kegalle
🚀 Quick Start
Prerequisites
- Python 3.11+
- Node.js 18+
- Docker Desktop (for Airflow)
- Groq API Key
Installation
# 1. Clone repository
git clone <your-repo>
cd Roger-Final
# 2. Create virtual environment
python -m venv .venv
source .venv/bin/activate # Linux/Mac
.\.venv\Scripts\activate # Windows
# 3. Install dependencies
pip install -r requirements.txt
# 4. Configure environment
cp .env.template .env
# Edit .env with your API keys
# 5. Download ML models
python models/anomaly-detection/download_models.py
# 6. Launch all services
./start_services.sh # Linux/Mac
.\start_services.ps1 # Windows
🔧 API Endpoints
REST API (FastAPI - Port 8000)
| Endpoint | Method | Description |
|---|---|---|
| /api/status | GET | System health |
| /api/dashboard | GET | Risk metrics |
| /api/feed | GET | Latest events |
| /api/feeds | GET | All feeds with pagination |
| /api/feeds/by_district | GET | Feeds filtered by district |
| /api/rivernet | GET | River monitoring data |
| /api/predict | POST | Run anomaly predictions |
| /api/anomalies | GET | Get anomalous feeds |
| /api/model/status | GET | ML model status |
| /api/weather/predictions | GET | All district forecasts |
| /api/weather/predictions/{district} | GET | Single district |
| /api/weather/model/status | GET | Weather model info |
| /api/weather/historical | GET | 30-year climate analysis |
| /api/weather/threat | GET | National flood threat score |
| /api/currency/prediction | GET | USD/LKR next-day forecast |
| /api/currency/history | GET | Historical rates |
| /api/currency/model/status | GET | Currency model info |
| /api/stocks/predictions | GET | All CSE stock forecasts |
| /api/stocks/predictions/{symbol} | GET | Single stock prediction |
| /api/stocks/model/status | GET | Stock models info |
| /api/rag/chat | POST | Chat with RAG |
| /api/rag/stats | GET | RAG system stats |
| /api/rag/clear | POST | Clear chat history |
| /api/power | GET | CEB power/load shedding status |
| /api/fuel | GET | Current fuel prices |
| /api/economy | GET | CBSL economic indicators |
| /api/health | GET | Health alerts & dengue data |
| /api/commodities | GET | Essential goods prices |
| /api/water | GET | Water supply disruptions |
WebSocket
- ws://localhost:8000/ws - Real-time updates
⏰ Airflow Orchestration
DAG: anomaly_detection_training
start → check_records → data_ingestion → data_validation
→ data_transformation → model_training → end
Triggers:
- Batch threshold: 1000 new records
- Daily fallback: Every 24 hours
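The two triggers amount to a simple decision: retrain when enough new records have arrived, or when a day has passed since the last run. A sketch of that check follows; `should_train` and its return labels are assumptions, and the actual DAG's `check_records` task may be implemented differently.

```python
from datetime import datetime, timedelta

BATCH_THRESHOLD = 1000                  # matches BATCH_THRESHOLD in .env
FALLBACK_INTERVAL = timedelta(hours=24)


def should_train(new_records: int, last_run: datetime, now: datetime):
    """Return (decision, reason) for the training trigger."""
    if new_records >= BATCH_THRESHOLD:
        return True, "batch_threshold"
    if now - last_run >= FALLBACK_INTERVAL:
        return True, "daily_fallback"
    return False, "skip"
```

Checking the batch threshold first means a busy news day can trigger retraining well before the 24-hour fallback.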
Access Dashboard:
cd models/anomaly-detection
astro dev start
# Open http://localhost:8080
DAG: weather_prediction_daily
ingest_data → train_models → generate_predictions → publish_predictions
Schedule: Daily at 4:00 AM IST
Tasks:
- Scrape Tutiempo.net for latest data
- Train LSTM models (MLflow tracked)
- Generate 25-district predictions
- Save to JSON for API
DAG: currency_prediction_daily
ingest_data → train_model → generate_prediction → publish_prediction
Schedule: Daily at 4:00 AM IST
Tasks:
- Fetch USD/LKR + indicators from yfinance
- Train GRU model (MLflow tracked)
- Generate next-day prediction
- Save to JSON for API
📁 Project Structure
Roger-Ultimate/
├── src/
│ ├── graphs/ # LangGraph definitions
│ │ ├── combinedAgentGraph.py # Mother graph
│ │ ├── politicalAgentGraph.py
│ │ ├── economicalAgentGraph.py
│ │ ├── meteorologicalAgentGraph.py
│ │ ├── socialAgentGraph.py
│ │ ├── intelligenceAgentGraph.py
│ │ ├── dataRetrievalAgentGraph.py
│ │ └── vectorizationAgentGraph.py # 6-step with anomaly detection
│ ├── nodes/ # Agent implementations
│ ├── states/ # State definitions
│ ├── llms/ # LLM configurations
│ ├── storage/ # ChromaDB, SQLite, Neo4j stores
│ ├── rag.py # RAG chatbot
│ └── utils/
│ └── utils.py # Tools incl. FloodWatch
├── scripts/
│ └── scrape_weather_data.py # Weather data scraper
├── models/
│ ├── anomaly-detection/ # ML Anomaly Pipeline
│ │ ├── src/
│ │ │ ├── components/ # Pipeline stages
│ │ │ ├── entity/ # Config/Artifact classes
│ │ │ ├── pipeline/ # Orchestrators
│ │ │ └── utils/ # Vectorizer, metrics
│ │ ├── dags/ # Airflow DAGs
│ │ ├── data_schema/ # Validation schemas
│ │ ├── output/ # Trained models
│ │ └── models_cache/ # Downloaded BERT models
│ ├── weather-prediction/ # Weather ML Pipeline
│ │ ├── src/components/ # data_ingestion, model_trainer, predictor
│ │ ├── dags/ # weather_prediction_dag.py (4 AM)
│ │ ├── artifacts/ # Trained LSTM models (.h5)
│ │ └── main.py # CLI entry point
│ └── currency-volatility-prediction/ # Currency ML Pipeline
│ ├── src/components/ # data_ingestion, model_trainer, predictor
│ ├── dags/ # currency_prediction_dag.py (4 AM)
│ ├── artifacts/ # Trained GRU model
│ └── main.py # CLI entry point
├── datasets/
│ └── weather/ # Scraped weather CSVs
├── frontend/
│ └── app/
│ ├── components/
│ │ ├── dashboard/
│ │ │ ├── AnomalyDetection.tsx
│ │ │ ├── WeatherPredictions.tsx
│ │ │ ├── CurrencyPrediction.tsx
│ │ │ ├── NationalThreatCard.tsx # Flood threat score
│ │ │ ├── HistoricalIntel.tsx # 30-year climate
│ │ │ └── ...
│ │ ├── map/
│ │ │ ├── MapView.tsx
│ │ │ └── SatelliteView.tsx # Windy.com embed
│ │ ├── FloatingChatBox.tsx # RAG chat UI
│ │ └── ...
│ └── pages/
│ └── Index.tsx # 7 tabs incl. SATELLITE
├── main.py # FastAPI backend
├── start.sh # Startup script
└── requirements.txt
🔐 Environment Variables
# LLM
GROQ_API_KEY=your_groq_key
# Neo4j (Knowledge Graph)
NEO4J_URI=neo4j+s://your-instance.databases.neo4j.io
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your_password
NEO4J_ENABLED=true
NEO4J_DATABASE=neo4j
# ChromaDB (Vector Store)
CHROMADB_PATH=./data/chromadb
CHROMADB_COLLECTION=Roger_feeds
CHROMADB_SIMILARITY_THRESHOLD=0.85
# SQLite (Fast Cache)
SQLITE_DB_PATH=./data/cache/feeds.db
# MLflow (DagsHub)
MLFLOW_TRACKING_URI=https://dagshub.com/...
MLFLOW_TRACKING_USERNAME=...
MLFLOW_TRACKING_PASSWORD=...
# Pipeline
BATCH_THRESHOLD=1000
🧪 Testing Framework
Industry-level testing infrastructure for the agentic AI system.
Test Structure
tests/
├── conftest.py # Pytest fixtures and configuration
├── unit/ # Unit tests for individual components
│ └── test_utils.py
├── integration/ # Multi-component integration tests
│ └── test_agent_routing.py
├── evaluation/ # LLM-as-Judge evaluation tests
│ ├── agent_evaluator.py # Evaluation harness
│ ├── adversarial_tests.py # Prompt injection & edge cases
│ └── golden_datasets/
│ └── expected_responses.json
└── e2e/ # End-to-end workflow tests
└── test_full_pipeline.py
LangSmith Integration
Automatic tracing for all agent decisions when LANGSMITH_API_KEY is set.
# Add to .env
LANGSMITH_API_KEY=your_langsmith_api_key
LANGSMITH_PROJECT=roger-intelligence # Optional, defaults to 'roger-intelligence'
View traces: smith.langchain.com
Running Tests
# Run all tests
python run_tests.py
# Run specific test suites
python run_tests.py --unit # Unit tests only
python run_tests.py --adversarial # Security/adversarial tests
python run_tests.py --eval # LLM-as-Judge evaluation
python run_tests.py --e2e # End-to-end tests
# With coverage report
python run_tests.py --coverage
# Enable LangSmith tracing in tests
python run_tests.py --with-langsmith
Agent Evaluation Harness
The agent_evaluator.py implements the LLM-as-Judge pattern:
| Metric | Description |
|---|---|
| Tool Selection Accuracy | Did the agent use the correct tools? |
| Response Quality | Is the response relevant and coherent? |
| BLEU Score | N-gram text similarity (0-1, higher = better match) |
| Hallucination Detection | Did the agent fabricate information? |
| Graceful Degradation | Does it handle failures properly? |
# Run standalone evaluator
python tests/evaluation/agent_evaluator.py
Adversarial Testing
Tests for security and robustness:
| Test Category | Description |
|---|---|
| Prompt Injection | Ignore instructions, jailbreak, context switching |
| Out-of-Domain | Non-SL queries, illegal requests, impossible questions |
| Malformed Input | Empty, XSS, SQL injection, unicode flood |
| Graceful Degradation | API timeouts, empty responses, rate limiting |
CI/CD Pipeline
GitHub Actions workflow (.github/workflows/test.yml):
on: [push, pull_request]
jobs:
unit-tests: # Runs on every push
adversarial-tests: # Security tests on every push
evaluation-tests: # LLM evaluation on main branch only
lint: # Code quality checks
Required Secrets:
- LANGSMITH_API_KEY - For evaluation test logging
- GROQ_API_KEY - For LLM-based evaluation
🐛 Troubleshooting
FastText won't install on Windows
# Use pre-built wheel instead
pip install fasttext-wheel
BERT models downloading slowly
# Pre-download all models
python models/anomaly-detection/download_models.py
Airflow not starting
# Ensure Docker is running
docker info
# Initialize Astro project
cd models/anomaly-detection
astro dev init
astro dev start
NumPy 2.0 / ChromaDB compatibility error
# If you see "A module that was compiled using NumPy 1.x cannot be run in NumPy 2.x"
pip install "numpy<2.0"
# Or upgrade chromadb to latest
pip install --upgrade chromadb
Keras model loading error ("Could not locate function 'mse'")
# If currency/weather models fail to load with Keras 3.x
# Retrain the model - it will save in .keras format automatically
cd models/currency-volatility-prediction
python main.py --mode train
# Or for weather
cd models/weather-prediction
python main.py --mode train
📄 License
MIT License - Built for Production
🙏 Acknowledgments
- Groq - High-speed LLM inference
- LangGraph - Agent orchestration
- HuggingFace - SinhalaBERTo, Tamil-BERT, DistilBERT
- Optuna - Hyperparameter optimization
- MLflow - Experiment tracking
- Sri Lankan government for open data sources