---
title: Pravaah - Ocean Hazard Detection System
emoji: 🌊
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
license: mit
short_description: AI-powered system to detect ocean hazards
---

# 🌊 Ocean Hazard Detection System

An AI-powered system that analyzes social media posts to detect ocean-related hazards in real time. It uses natural language processing to translate tweets to English, identify hazardous ones, analyze their sentiment, and extract location information.

## 🚀 Features

- **Multilingual Support**: Analyzes tweets in 20+ languages used in India, including Hindi, Bengali, Tamil, Telugu, Marathi, Gujarati, and English
- **Hazard Classification**: Uses zero-shot classification with a DeBERTa-v3 NLI model to identify ocean hazards
- **Sentiment Analysis**: Analyzes emotional context using the GoEmotions model
- **Named Entity Recognition**: Extracts hazard types and locations from text
- **Real-time Processing**: Processes tweets from Indian coastal regions
- **Database Storage**: Stores hazardous tweets for tracking and analysis

## 🔍 What It Detects

### Hazard Types

- Floods and tsunamis
- Cyclones and storm surges
- High tides and waves
- Coastal flooding and erosion
- Rip currents and marine debris
- Water discoloration and algal blooms
- Marine pollution

### Geographic Coverage

- **Major Cities**: Mumbai, Chennai, Kolkata, Vizag, Puri
- **States**: Odisha, Kerala, Gujarat, Goa, Andhra Pradesh, West Bengal
- **Water Bodies**: Bay of Bengal, Arabian Sea

## 🛠️ Technical Stack

- **AI Models**:
  - DeBERTa-v3 for hazard classification
  - Helsinki-NLP for translation
  - GoEmotions for sentiment analysis
  - DistilBERT NER for location extraction
- **Backend**: FastAPI + Gradio
- **Database**: PostgreSQL
- **Languages**: Python 3.9+

## 📊 How It Works

1. **Tweet Collection**: Collects tweets through the Twitter API using hazard and location keywords
2. **Translation**: Translates all tweets to English so that every later step runs on a single language
3. **Hazard Classification**: Uses zero-shot classification on the translated text to label each tweet as hazardous or safe
4. **Sentiment Analysis**: Analyzes the emotional context (panic, calm, confusion, neutral) of hazardous tweets
5. **Entity Extraction**: Identifies specific hazard types and locations in the translated text
6. **Database Storage**: Stores hazardous tweets with metadata for tracking

## 🚀 Usage

### Web Interface (Gradio)

1. **Set Tweet Limit**: Choose how many tweets to analyze (1-50)
2. **Click Analyze**: The system processes the tweets and shows the results
3. **View Results**: See hazardous tweets with sentiment, location, and hazard type
4. **Export Data**: Download the complete analysis as JSON

### API Endpoints (FastAPI)

#### **POST /analyze**

Analyze tweets for ocean hazards

```bash
# Basic analysis
curl -X POST "http://localhost:8000/analyze" \
  -H "Content-Type: application/json" \
  -d '{"limit": 20}'

# Keyword-based search
curl -X POST "http://localhost:8000/analyze" \
  -H "Content-Type: application/json" \
  -d '{"limit": 20, "hazard_type": "tsunami", "location": "Mumbai", "days_back": 2}'

# Custom query
curl -X POST "http://localhost:8000/analyze" \
  -H "Content-Type: application/json" \
  -d '{"limit": 20, "query": "flood OR tsunami"}'
```

#### **GET /hazardous-tweets**

Get stored hazardous tweets

```bash
curl "http://localhost:8000/hazardous-tweets?limit=50&offset=0"
```

#### **GET /keywords/hazards**

Get available hazard types for keyword search

```bash
curl "http://localhost:8000/keywords/hazards"
```

#### **GET /keywords/locations**

Get available locations for keyword search

```bash
curl "http://localhost:8000/keywords/locations"
```

#### **GET /stats**

Get analysis statistics

```bash
curl "http://localhost:8000/stats"
```

#### **GET /health**

Health check endpoint

```bash
curl "http://localhost:8000/health"
```

### API Documentation

- **Swagger UI**: `http://localhost:8000/docs`
- **ReDoc**: `http://localhost:8000/redoc`

## 🔧 Environment Variables

The system reads the following environment variables. The Twitter API key is required; the PostgreSQL settings are optional for the demo.

```bash
# Twitter API (required)
TWITTER_API_KEY=your_twitter_api_key

# PostgreSQL Database (optional for demo)
PGHOST=localhost
PGPORT=5432
PGDATABASE=postgres
PGUSER=postgres
PGPASSWORD=your_password
```

## 📈 Use Cases

- **Emergency Response**: Early detection of ocean hazards for rapid response
- **Environmental Monitoring**: Track marine pollution and coastal issues
- **Research**: Analyze public sentiment about ocean-related events
- **Policy Making**: Data-driven insights for coastal management policies

## 🔬 Model Details

- **Classification Model**: `cross-encoder/nli-deberta-v3-base`
- **Translation Model**: Helsinki-NLP OPUS-MT models
- **Sentiment Model**: Google GoEmotions
- **NER**: DistilBERT NER with keyword-based fallback

## 📝 License

This project is licensed under the MIT License; see the LICENSE file for details.

## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## 📞 Support

For support, please open an issue in the GitHub repository.

---

**Note**: This is a demonstration system. In production, it would process real-time tweets and integrate with emergency response systems.
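The six steps under "How It Works" can be sketched as a single loop. This is only an illustration of the control flow, not the project's actual code: `TweetResult`, `run_pipeline`, and every callable passed in are hypothetical names, and the stub lambdas in the demo stand in for the real translation, classification, sentiment, and NER models.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional


@dataclass
class TweetResult:
    """Hypothetical record for one processed tweet."""
    original: str
    translated: str
    hazardous: bool
    sentiment: Optional[str] = None
    entities: dict = field(default_factory=dict)


def run_pipeline(
    tweets: Iterable[str],
    translate: Callable[[str], str],
    is_hazardous: Callable[[str], bool],
    sentiment_of: Callable[[str], str],
    extract_entities: Callable[[str], dict],
    store: Callable[[TweetResult], None],
) -> List[TweetResult]:
    """Translate every tweet, classify it, and enrich/store only hazardous ones."""
    results = []
    for text in tweets:
        translated = translate(text)                      # step 2
        result = TweetResult(text, translated, is_hazardous(translated))  # step 3
        if result.hazardous:
            result.sentiment = sentiment_of(translated)   # step 4
            result.entities = extract_entities(translated)  # step 5
            store(result)                                 # step 6
        results.append(result)
    return results


# Demo with stand-in functions (assumptions, not the real models).
stored: List[TweetResult] = []
demo = run_pipeline(
    ["Huge waves flooding the Mumbai coast", "Nice day at the beach"],
    translate=lambda t: t,
    is_hazardous=lambda t: "flood" in t.lower(),
    sentiment_of=lambda t: "panic",
    extract_entities=lambda t: {"hazard": "flood", "location": "Mumbai"},
    store=stored.append,
)
```

Passing the models in as callables keeps the sketch independent of any particular library, and it makes the "sentiment and storage only for hazardous tweets" rule explicit in one place.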
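The curl examples for `POST /analyze` can also be issued from Python. The sketch below uses only the standard library and the request fields that appear in those examples (`limit`, `hazard_type`, `location`, `days_back`, `query`). The helper names are hypothetical, and the response schema is not documented here, so the parsed JSON is returned as-is.

```python
import json
from typing import Any, Optional
from urllib import request

BASE_URL = "http://localhost:8000"  # host and port taken from the curl examples


def build_analyze_payload(
    limit: int = 20,
    hazard_type: Optional[str] = None,
    location: Optional[str] = None,
    days_back: Optional[int] = None,
    query: Optional[str] = None,
) -> dict:
    """Build the JSON body, leaving out any optional field that was not set."""
    optional = {
        "hazard_type": hazard_type,
        "location": location,
        "days_back": days_back,
        "query": query,
    }
    payload = {"limit": limit}
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


def build_analyze_request(payload: dict, base_url: str = BASE_URL) -> request.Request:
    """Create (but do not send) the POST request for /analyze."""
    return request.Request(
        f"{base_url}/analyze",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_analyze(payload: dict, base_url: str = BASE_URL, timeout: float = 30.0) -> Any:
    """Send the request and return the decoded JSON response, whatever its shape."""
    with request.urlopen(build_analyze_request(payload, base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running locally, `post_analyze(build_analyze_payload(limit=20, hazard_type="tsunami", location="Mumbai", days_back=2))` mirrors the keyword-based curl example.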
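For the variables listed under "Environment Variables", a loader might look like the sketch below. It assumes that `TWITTER_API_KEY` must be present and that the PostgreSQL variables fall back to the example values shown in that section; whether the project applies the same defaults is an assumption, and `load_settings` is a hypothetical helper.

```python
import os
from typing import Mapping


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read the Twitter key and PostgreSQL settings from a mapping of env vars."""
    api_key = env.get("TWITTER_API_KEY")
    if not api_key:
        # The key is marked as required, so fail early with a clear message.
        raise RuntimeError("TWITTER_API_KEY is required")
    return {
        "twitter_api_key": api_key,
        "pg": {
            # Defaults mirror the example values; treating them as fallbacks is an assumption.
            "host": env.get("PGHOST", "localhost"),
            "port": int(env.get("PGPORT", "5432")),
            "dbname": env.get("PGDATABASE", "postgres"),
            "user": env.get("PGUSER", "postgres"),
            "password": env.get("PGPASSWORD"),  # None when no password is configured
        },
    }
```

Accepting the environment as a parameter makes the function easy to exercise with a plain dictionary, for example `load_settings({"TWITTER_API_KEY": "test"})`.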