title: Pravaah - Ocean Hazard Detection System
emoji: π
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
license: mit
short_description: AI-powered system to detect ocean hazards
Ocean Hazard Detection System
An AI-powered system that analyzes social media posts to detect ocean-related hazards in real time. It uses natural language processing to translate tweets to English, classify them as hazardous or safe, analyze sentiment, and extract hazard types and locations.
Features
- Multilingual Support: Analyzes tweets in English and 20+ Indian languages, including Hindi, Bengali, Tamil, Telugu, Marathi, and Gujarati
- Hazard Classification: Uses DeBERTa-v3 zero-shot classification to identify ocean hazards
- Sentiment Analysis: Analyzes emotional context using GoEmotions model
- Named Entity Recognition: Extracts hazard types and locations from text
- Real-time Processing: Processes tweets from Indian coastal regions
- Database Storage: Stores hazardous tweets for tracking and analysis
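As a rough illustration of the zero-shot hazard classification listed above, the sketch below uses the Hugging Face `transformers` zero-shot pipeline with the `cross-encoder/nli-deberta-v3-base` model named under Model Details. The candidate labels, the 0.5 threshold, and the helper names are assumptions for illustration, not the project's actual configuration.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical label set; the labels the project really uses are not documented here.
CANDIDATE_LABELS = ["ocean hazard", "not a hazard"]


def build_classifier() -> Callable[..., Dict[str, Any]]:
    """Create the zero-shot pipeline (downloads the model on first use)."""
    # Imported lazily so the helpers below work without transformers installed.
    from transformers import pipeline

    return pipeline(
        "zero-shot-classification",
        model="cross-encoder/nli-deberta-v3-base",
    )


def is_hazardous(result: Dict[str, Any], threshold: float = 0.5) -> bool:
    """Interpret one zero-shot result.

    The pipeline returns a dict with "sequence", "labels" and "scores",
    where labels are sorted by descending score.
    """
    top_label = result["labels"][0]
    top_score = result["scores"][0]
    return top_label == CANDIDATE_LABELS[0] and top_score >= threshold


def classify(text: str, classifier: Optional[Callable[..., Dict[str, Any]]] = None) -> bool:
    """Classify English text as hazardous (True) or safe (False)."""
    classifier = classifier or build_classifier()
    return is_hazardous(classifier(text, candidate_labels=CANDIDATE_LABELS))
```

Calling `classify("Huge waves are flooding the beach road in Puri")` loads the model on first use; passing a prebuilt `classifier` avoids reloading it for every tweet.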
What It Detects
Hazard Types
- Floods and tsunamis
- Cyclones and storm surges
- High tides and waves
- Coastal flooding and erosion
- Rip currents and marine debris
- Water discoloration and algal blooms
- Marine pollution
Geographic Coverage
- Major Cities: Mumbai, Chennai, Kolkata, Vizag, Puri
- States: Odisha, Kerala, Gujarat, Goa, Andhra Pradesh, West Bengal
- Water Bodies: Bay of Bengal, Arabian Sea
Technical Stack
- AI Models:
  - DeBERTa-v3 for hazard classification
  - Helsinki-NLP for translation
  - GoEmotions for sentiment analysis
  - DistilBERT NER for location extraction
- Backend: FastAPI + Gradio
- Database: PostgreSQL
- Languages: Python 3.9+
How It Works
- Tweet Collection: Collects tweets through the Twitter API using hazard and location keywords
- Translation: Translates all tweets to English so that every later stage runs on a single language
- Hazard Classification: Uses zero-shot learning on translated text to classify as hazardous or safe
- Sentiment Analysis: Analyzes emotional context (panic, calm, confusion, neutral) for hazardous tweets
- Entity Extraction: Identifies specific hazard types and locations from translated text
- Database Storage: Stores hazardous tweets with metadata for tracking
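The steps above can be sketched as a small orchestration function. This is a minimal sketch, not the project's code: the `Analysis` record, the `analyze_tweets` function, and the stage callables are hypothetical names, and running entity extraction only on hazardous tweets is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Analysis:
    """Hypothetical per-tweet record produced by the pipeline."""
    original: str
    translated: str
    hazardous: bool
    sentiment: Optional[str] = None
    entities: Dict[str, List[str]] = field(default_factory=dict)


def analyze_tweets(
    tweets: List[str],
    translate: Callable[[str], str],
    classify: Callable[[str], bool],
    sentiment: Callable[[str], str],
    extract: Callable[[str], Dict[str, List[str]]],
    store: Callable[[Analysis], None],
) -> List[Analysis]:
    """Run already-collected tweets through the remaining stages in order."""
    results: List[Analysis] = []
    for tweet in tweets:
        english = translate(tweet)        # translation
        hazardous = classify(english)     # hazard classification
        analysis = Analysis(tweet, english, hazardous)
        if hazardous:
            # Sentiment, entities and storage apply to hazardous tweets only
            # (the entity-extraction part of this is an assumption).
            analysis.sentiment = sentiment(english)
            analysis.entities = extract(english)
            store(analysis)
        results.append(analysis)
    return results
```

Passing each stage in as a callable keeps the models swappable and lets the flow be exercised with simple stand-ins, for example `store=some_list.append`.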
Usage
Web Interface (Gradio)
- Set Tweet Limit: Choose how many tweets to analyze (1-50)
- Click Analyze: The system will process tweets and show results
- View Results: See hazardous tweets with sentiment, location, and hazard type
- Export Data: Download complete analysis as JSON
API Endpoints (FastAPI)
POST /analyze
Analyze tweets for ocean hazards
# Basic analysis
curl -X POST "http://localhost:8000/analyze" \
-H "Content-Type: application/json" \
-d '{"limit": 20}'
# Keyword-based search
curl -X POST "http://localhost:8000/analyze" \
-H "Content-Type: application/json" \
-d '{"limit": 20, "hazard_type": "tsunami", "location": "Mumbai", "days_back": 2}'
# Custom query
curl -X POST "http://localhost:8000/analyze" \
-H "Content-Type: application/json" \
-d '{"limit": 20, "query": "flood OR tsunami"}'
GET /hazardous-tweets
Get stored hazardous tweets
curl "http://localhost:8000/hazardous-tweets?limit=50&offset=0"
GET /keywords/hazards
Get available hazard types for keyword search
curl "http://localhost:8000/keywords/hazards"
GET /keywords/locations
Get available locations for keyword search
curl "http://localhost:8000/keywords/locations"
GET /stats
Get analysis statistics
curl "http://localhost:8000/stats"
GET /health
Health check endpoint
curl "http://localhost:8000/health"
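For callers who prefer Python to curl, the requests above could be issued with the standard library alone. This is a sketch under assumptions: the helper names are hypothetical, and because the response schema is not documented here, the parsed JSON is returned as-is.

```python
import json
import urllib.request
from typing import Any, Union
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"  # same host and port as the curl examples


def build_analyze_request(limit: int = 20, **filters: Any) -> urllib.request.Request:
    """Build a POST /analyze request.

    Filters mirror the JSON fields in the curl examples
    (hazard_type, location, days_back, query); None values are dropped.
    """
    payload = {"limit": limit}
    payload.update({k: v for k, v in filters.items() if v is not None})
    return urllib.request.Request(
        f"{BASE_URL}/analyze",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def hazardous_tweets_url(limit: int = 50, offset: int = 0) -> str:
    """Build the GET /hazardous-tweets URL with paging parameters."""
    return f"{BASE_URL}/hazardous-tweets?" + urlencode({"limit": limit, "offset": offset})


def fetch_json(target: Union[str, urllib.request.Request], timeout: float = 30.0) -> Any:
    """Send the request (or GET the URL) and parse the JSON response body."""
    with urllib.request.urlopen(target, timeout=timeout) as response:
        return json.load(response)
```

With the server running, `fetch_json(build_analyze_request(limit=20, hazard_type="tsunami", location="Mumbai", days_back=2))` corresponds to the keyword-based curl call, and `fetch_json(hazardous_tweets_url())` to the stored-tweets call.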
API Documentation
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
Environment Variables
The system reads the following environment variables. The Twitter API key is required; the PostgreSQL settings are optional for the demo:
# Twitter API (required)
TWITTER_API_KEY=your_twitter_api_key
# PostgreSQL Database (optional for demo)
PGHOST=localhost
PGPORT=5432
PGDATABASE=postgres
PGUSER=postgres
PGPASSWORD=your_password
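One way an application might read these variables at startup is sketched below. The function names are hypothetical, and treating "no PG* variable set" as "run without a database" is an assumption about the demo mode, not documented behavior. The defaults mirror the values in the listing above.

```python
import os
from typing import Mapping, Optional
from urllib.parse import quote

PG_VARS = ("PGHOST", "PGPORT", "PGDATABASE", "PGUSER", "PGPASSWORD")


def twitter_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the required Twitter API key or fail early with a clear error."""
    key = env.get("TWITTER_API_KEY", "").strip()
    if not key:
        raise RuntimeError("TWITTER_API_KEY is required")
    return key


def postgres_dsn(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Build a PostgreSQL connection URI, or None when no PG* variable is set."""
    if not any(name in env for name in PG_VARS):
        return None  # assumption: demo mode without database storage
    host = env.get("PGHOST", "localhost")
    port = env.get("PGPORT", "5432")
    database = env.get("PGDATABASE", "postgres")
    user = quote(env.get("PGUSER", "postgres"), safe="")
    password = quote(env.get("PGPASSWORD", ""), safe="")
    # Percent-encode credentials so characters like "@" or ":" stay valid in the URI.
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"
```

Both functions accept any mapping, so they can be checked with a plain dict instead of the real process environment.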
Use Cases
- Emergency Response: Early detection of ocean hazards for rapid response
- Environmental Monitoring: Track marine pollution and coastal issues
- Research: Analyze public sentiment about ocean-related events
- Policy Making: Data-driven insights for coastal management policies
Model Details
- Classification Model: cross-encoder/nli-deberta-v3-base
- Translation Model: Helsinki-NLP OPUS-MT models
- Sentiment Model: Google GoEmotions
- NER: DistilBERT NER with keyword-based fallback
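The keyword-based fallback mentioned for NER could look roughly like the sketch below. The keyword lists are drawn from the hazard types and geographic coverage listed earlier in this README; the function name, the matching rules, and the output shape are assumptions rather than the project's implementation.

```python
import re
from typing import Dict, List

# Keywords taken from the "What It Detects" section; the real lists may differ.
HAZARD_KEYWORDS = [
    "flood", "tsunami", "cyclone", "storm surge", "high tide",
    "erosion", "rip current", "algal bloom", "pollution",
]
LOCATION_KEYWORDS = [
    "Mumbai", "Chennai", "Kolkata", "Vizag", "Puri",
    "Odisha", "Kerala", "Gujarat", "Goa", "Andhra Pradesh", "West Bengal",
    "Bay of Bengal", "Arabian Sea",
]


def extract_entities(text: str) -> Dict[str, List[str]]:
    """Return hazard and location keywords found in English text."""
    hazards = [
        kw for kw in HAZARD_KEYWORDS
        # Leading word boundary only, so "flood" also matches "floods" and "flooding".
        if re.search(rf"\b{re.escape(kw)}", text, flags=re.IGNORECASE)
    ]
    locations = [
        kw for kw in LOCATION_KEYWORDS
        # Whole-word match, so "Goa" does not match "goal".
        if re.search(rf"\b{re.escape(kw)}\b", text, flags=re.IGNORECASE)
    ]
    return {"hazards": hazards, "locations": locations}
```

A fallback like this only catches terms on the lists, so it complements the NER model rather than replacing it.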
License
This project is licensed under the MIT License - see the LICENSE file for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Support
For support, please open an issue in the GitHub repository.
Note: This is a demonstration system. In production, it would process real-time tweets and integrate with emergency response systems.