Quick Start Guide

Installation & Setup (5 minutes)

1. Install Dependencies

pip install -r requirements.txt

2. Configure Azure OpenAI

Create a .env file with your Azure OpenAI credentials:

cp .env.example .env

Edit .env:

AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=your-api-key-here
AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4o-mini
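
Before launching the app, it can help to confirm that all three settings are actually filled in. The following is a minimal sketch, not part of the project: `parse_env_text` and `missing_azure_settings` are hypothetical helpers, and the parser only handles simple `KEY=VALUE` lines like the ones above.

```python
REQUIRED_VARS = (
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_DEPLOYMENT_NAME",
)


def parse_env_text(text):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def missing_azure_settings(settings):
    """Return the required variable names that are absent or blank."""
    return [name for name in REQUIRED_VARS if not settings.get(name, "").strip()]
```

For example, `missing_azure_settings(parse_env_text(open(".env").read()))` returns an empty list when every required value is present.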

3. Run the Application

python app.py

Visit http://localhost:7860 in your browser.

First Query

Try this example query:

Research Question: "What are the latest advances in multi-agent reinforcement learning?"
Category: cs.AI - Artificial Intelligence
Number of Papers: 3

Click "Analyze Papers" and wait ~1-2 minutes.
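
For orientation, the form fields above map naturally onto an arXiv search. The sketch below builds a query URL for the public arXiv API from the same three inputs. It is an illustration only: the app's Retriever Agent may use a different client or query syntax, and `build_arxiv_query_url` is a hypothetical helper. The code only constructs the URL and does not send a request.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_arxiv_query_url(question, category, max_results=3):
    """Build an arXiv API URL restricted to one category (no request is sent)."""
    params = {
        # cat: filters by category, all: searches every metadata field
        "search_query": f"cat:{category} AND all:{question}",
        "start": 0,
        "max_results": max_results,
        "sortBy": "relevance",
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"


url = build_arxiv_query_url("multi-agent reinforcement learning", "cs.AI", 3)
```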

Expected Output

You should see:

  1. Papers Tab: Table with 3 retrieved papers
  2. Analysis Tab: Detailed analysis of each paper
  3. Synthesis Tab:
    • Executive summary
    • Consensus findings (green highlights)
    • Contradictions (yellow highlights)
    • Research gaps
  4. Citations Tab: APA-formatted references
  5. Stats Tab: Processing time and cost (~$0.20-0.40)

Troubleshooting

Error: "No module named 'xyz'"

pip install -r requirements.txt --upgrade

Error: "Azure OpenAI authentication failed"

  • Check that your .env file contains the correct endpoint and API key
  • Verify that AZURE_OPENAI_DEPLOYMENT_NAME matches the name of your actual deployment

Error: "Failed to download paper"

  • Some arXiv papers may have download issues
  • Try a different query or category

Error: "ChromaDB error"

# Delete the local ChromaDB data directory, then restart the app
rm -rf data/chroma_db/
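
On systems without `rm` (for example, a plain Windows shell), the same cleanup can be done from Python. This is a small sketch; `reset_vector_store` is a hypothetical helper, and the default path simply mirrors the command above.

```python
import shutil
from pathlib import Path


def reset_vector_store(path="data/chroma_db"):
    """Delete the directory at `path`; return True if something was removed."""
    store = Path(path)
    if not store.is_dir():
        return False  # nothing to delete
    shutil.rmtree(store)
    return True
```

Call `reset_vector_store()` from the project root, then restart the app.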

Architecture Overview

User Query
    ↓
Retriever Agent (arXiv search + PDF processing)
    ↓
Analyzer Agent (RAG-based analysis per paper)
    ↓
Synthesis Agent (Cross-paper comparison)
    ↓
Citation Agent (Validation + APA formatting)
    ↓
Gradio UI (5 output tabs)
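
The flow above is a linear hand-off: each agent reads what the previous one produced and adds its own output. A minimal sketch of that pattern follows. All names (`PipelineState`, `run_pipeline`, and the four stub stages) are hypothetical and stand in for the real agents, which call arXiv and Azure OpenAI.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PipelineState:
    """Data passed from one stage to the next."""
    query: str
    papers: List[str] = field(default_factory=list)
    analyses: Dict[str, str] = field(default_factory=dict)
    synthesis: str = ""
    citations: List[str] = field(default_factory=list)


Stage = Callable[[PipelineState], PipelineState]


def retrieve(state: PipelineState) -> PipelineState:
    state.papers = [f"paper-{i}" for i in range(1, 4)]  # stub for arXiv search
    return state


def analyze(state: PipelineState) -> PipelineState:
    state.analyses = {p: f"analysis of {p}" for p in state.papers}  # stub
    return state


def synthesize(state: PipelineState) -> PipelineState:
    state.synthesis = f"synthesis across {len(state.analyses)} papers"  # stub
    return state


def cite(state: PipelineState) -> PipelineState:
    state.citations = [f"[{i}] {p}" for i, p in enumerate(state.papers, start=1)]
    return state


def run_pipeline(state: PipelineState, stages: List[Stage]) -> PipelineState:
    """Run the stages in order, threading the state through each one."""
    for stage in stages:
        state = stage(state)
    return state


result = run_pipeline(PipelineState("example query"), [retrieve, analyze, synthesize, cite])
```

Keeping every stage to the same signature makes it straightforward to test agents in isolation or to insert an extra step between two existing ones.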

Key Features

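  • Temperature=0: Near-deterministic outputs, so repeated runs are largely reproducible
  • RAG Grounding: Claims in the analysis are backed by retrieved source text
  • Semantic Caching: Repeated or similar queries are served from the cache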
  • Cost Tracking: Real-time cost estimates
  • Error Handling: Graceful failures with user-friendly messages
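
To make the semantic caching idea concrete, here is a minimal sketch of a similarity-based cache. It is not the app's implementation: `SemanticCache`, the 0.9 threshold, and `toy_embed` (letter counts standing in for a real embedding model) are all assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Return a cached value when a new query is close enough to an old one."""

    def __init__(self, embed, threshold=0.9):
        self.embed = embed          # callable: text -> vector
        self.threshold = threshold  # minimum similarity counted as a hit
        self._entries = []          # list of (vector, value) pairs

    def put(self, query, value):
        self._entries.append((self.embed(query), value))

    def get(self, query):
        vector = self.embed(query)
        best_score, best_value = 0.0, None
        for cached_vector, value in self._entries:
            score = cosine_similarity(vector, cached_vector)
            if score > best_score:
                best_score, best_value = score, value
        return best_value if best_score >= self.threshold else None


def toy_embed(text):
    """Stand-in embedding (letter counts); a real system would use a model."""
    counts = [0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1
    return counts


cache = SemanticCache(toy_embed)
cache.put("advances in multi-agent RL", "cached analysis")
hit = cache.get("Advances in multi-agent RL!")  # same letters, so a cache hit
miss = cache.get("zzzz")                        # unrelated text, so a miss
```

The threshold trades cost savings against the risk of returning an answer for a question that only looks similar, so it is worth tuning against real queries.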

Performance Benchmarks

Papers   Time     Cost (USD)   Chunks
3        ~90s     $0.25        ~150
5        ~120s    $0.40        ~250
10       ~180s    $0.75        ~500
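
Dividing each row by its paper count gives a rough per-paper view of these figures. The sketch below does only that arithmetic on the approximate numbers listed above; `per_paper` is a hypothetical helper.

```python
BENCHMARKS = [
    # (papers, seconds, cost_usd, chunks), approximate values from the table
    (3, 90, 0.25, 150),
    (5, 120, 0.40, 250),
    (10, 180, 0.75, 500),
]


def per_paper(rows):
    """Divide each benchmark row by its paper count."""
    return [
        {
            "papers": n,
            "seconds_per_paper": round(seconds / n, 1),
            "cost_per_paper": round(cost / n, 3),
            "chunks_per_paper": round(chunks / n),
        }
        for n, seconds, cost, chunks in rows
    ]


summary = per_paper(BENCHMARKS)
```

In these figures the chunk count stays at about 50 per paper, while per-paper time drops from about 30s to 18s and per-paper cost from about $0.083 to $0.075 as the batch grows.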

Next Steps

  1. Customize Categories: Edit ARXIV_CATEGORIES in app.py
  2. Adjust Chunking: Modify chunk_size in utils/pdf_processor.py
  3. Change Top-K: Update top_k in rag/retrieval.py
  4. Enable Debug Logging: Raise the log verbosity (e.g. to DEBUG) in the agents when debugging
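
For the logging step, one way to get more detail from the agents without flooding the console with third-party output is to raise verbosity for a single logger subtree. This is a sketch using the standard `logging` module; the logger name `agents` is an assumption that only holds if the agent modules create their loggers with `logging.getLogger(__name__)` inside an `agents` package.

```python
import logging


def enable_debug_logging(logger_name="agents"):
    """Keep other loggers at INFO while one logger subtree emits DEBUG."""
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
    )
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)  # child loggers inherit this level
    return logger


agents_logger = enable_debug_logging()
agents_logger.debug("debug logging enabled for the agents subtree")
```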

Deployment to Hugging Face

# 1. Create a new Space on huggingface.co
# 2. Upload the project files (do not upload your .env; credentials go in step 3)
# 3. Add secrets in Space settings:
#    - AZURE_OPENAI_ENDPOINT
#    - AZURE_OPENAI_API_KEY
#    - AZURE_OPENAI_DEPLOYMENT_NAME
# 4. The Space builds and deploys automatically

Support

For issues: https://github.com/yourusername/Multi-Agent-Research-Paper-Analysis-System/issues