KhalilGuetari committed on
Commit 5aaaef8 · 1 Parent(s): 2a623ac

Deployment on hf spaces
.dockerignore ADDED
@@ -0,0 +1,65 @@
+ # Git
+ .git
+ .gitignore
+ .gitattributes
+
+ # Python
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ *.egg-info/
+ dist/
+ build/
+ *.egg
+
+ # Virtual environments
+ .venv/
+ venv/
+ ENV/
+ env/
+
+ # PDM
+ .pdm-python
+ .pdm-build/
+
+ # IDE
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
+ *~
+
+ # Testing
+ .pytest_cache/
+ .coverage
+ htmlcov/
+ .tox/
+
+ # Documentation
+ docs/_build/
+
+ # Environment files
+ .env
+ .envrc
+
+ # Cache
+ cache/
+ __pycache__/
+ .ruff_cache/
+
+ # Logs
+ *.log
+ scripts.log
+
+ # OS
+ .DS_Store
+ Thumbs.db
+
+ # Kiro
+ .kiro/
+
+ # Development files
+ scripts/playground/
+ tests/
.gitignore CHANGED
@@ -207,4 +207,8 @@ marimo/_lsp/
  __marimo__/
 
  # Cache
- cache/
+ cache/
+
+ # Docker
+ docker-compose.override.yml
+ .docker/
.kiro/specs/hf-eda-mcp-server/tasks.md CHANGED
@@ -57,7 +57,7 @@
  - Include proper logging and error handling for server operations
  - _Requirements: 4.1, 4.2, 4.4_
 
- - [ ] 5. Implement error handling and validation
+ - [x] 5. Implement error handling and validation
  - [x] 5.1 Add input validation for all tools
  - Validate dataset identifiers and configuration names
  - Check split names and sample size parameters
@@ -77,13 +77,13 @@
  - _Requirements: 1.1, 2.1, 5.1_
 
  - [ ] 6. Integration and deployment setup
- - [ ] 6.1 Create main entry point and CLI
+ - [x] 6.1 Create main entry point and CLI
  - Implement main module for running the server
  - Add command-line interface for server configuration
  - Include help documentation and usage examples
  - _Requirements: 4.1, 4.2_
 
- - [ ] 6.2 Add deployment configuration
+ - [x] 6.2 Add deployment configuration
  - Create configuration for HuggingFace Spaces deployment
  - Add Docker configuration for containerized deployment
  - Include MCP client configuration examples
Dockerfile ADDED
@@ -0,0 +1,53 @@
+ # Multi-stage build for hf-eda-mcp server
+ FROM python:3.13-slim AS builder
+
+ # Set working directory
+ WORKDIR /app
+
+ # Install system dependencies
+ RUN apt-get update && apt-get install -y \
+     git \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Install PDM
+ RUN pip install --no-cache-dir pdm
+
+ # Copy dependency files
+ COPY pyproject.toml pdm.lock* ./
+
+ # Install dependencies
+ RUN pdm install --prod --no-lock --no-editable
+
+ # Production stage
+ FROM python:3.13-slim
+
+ # Set working directory
+ WORKDIR /app
+
+ # Install runtime dependencies
+ RUN apt-get update && apt-get install -y \
+     git \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Copy installed dependencies from builder
+ COPY --from=builder /app/.venv /app/.venv
+
+ # Copy application code
+ COPY src/ ./src/
+ COPY README.md LICENSE ./
+
+ # Set environment variables
+ ENV PATH="/app/.venv/bin:$PATH"
+ ENV PYTHONUNBUFFERED=1
+ ENV GRADIO_SERVER_NAME="0.0.0.0"
+ ENV GRADIO_SERVER_PORT=7860
+
+ # Expose Gradio port
+ EXPOSE 7860
+
+ # Health check
+ HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
+     CMD python -c "import requests; requests.get('http://localhost:7860/health', timeout=5)"
+
+ # Run the MCP server
+ CMD ["python", "-m", "hf_eda_mcp"]
README.md CHANGED
@@ -1,2 +1,55 @@
- # hf-eda-mcp
- An MCP server providing tools for EDA for HuggingFace datasets
+ ---
+ title: HF EDA MCP Server
+ emoji: 🔍
+ colorFrom: blue
+ colorTo: purple
+ sdk: gradio
+ sdk_version: 5.49.1
+ app_file: app.py
+ pinned: false
+ license: apache-2.0
+ ---
+
+ # HF EDA MCP Server
+
+ An MCP (Model Context Protocol) server that provides tools for Exploratory Data Analysis (EDA) of HuggingFace datasets.
+
+ ## Features
+
+ - **Dataset Metadata**: Retrieve comprehensive information about HuggingFace datasets
+ - **Dataset Sampling**: Get samples from any dataset split for quick exploration
+ - **Feature Analysis**: Perform basic EDA including statistics, missing values, and distributions
+
+ ## Usage
+
+ This Space runs as an MCP server that can be accessed by MCP-compatible AI assistants.
+
+ ### MCP Client Configuration
+
+ Add this server to your MCP client configuration:
+
+ ```json
+ {
+   "mcpServers": {
+     "hf-eda-mcp": {
+       "url": "https://YOUR-USERNAME-hf-eda-mcp.hf.space/gradio_api/mcp/sse"
+     }
+   }
+ }
+ ```
+
+ Replace `YOUR-USERNAME` with your HuggingFace username.
+
+ ### Available Tools
+
+ 1. **get_dataset_metadata**: Get detailed information about a dataset
+ 2. **get_dataset_sample**: Retrieve sample rows from a dataset
+ 3. **analyze_dataset_features**: Perform exploratory analysis on dataset features
+
+ ## Authentication
+
+ For private datasets, set the `HF_TOKEN` secret in your Space settings.
+
+ ## License
+
+ Apache License 2.0
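The endpoint URL in the README follows the `https://<owner>-<space>.hf.space` naming convention that Spaces use, with Gradio's MCP server mounted at `/gradio_api/mcp/sse`. A small helper (the function name is hypothetical, not part of the commit) makes the construction explicit:

```python
def mcp_sse_url(username: str, space: str = "hf-eda-mcp") -> str:
    """Build the MCP SSE endpoint URL for a HuggingFace Space.

    Spaces are served at https://<owner>-<space>.hf.space, and Gradio
    exposes its MCP server under the /gradio_api/mcp/sse path.
    """
    return f"https://{username}-{space}.hf.space/gradio_api/mcp/sse"

print(mcp_sse_url("YOUR-USERNAME"))
```

This is the same string you would paste into the `"url"` field of the JSON configuration above.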
app.py ADDED
@@ -0,0 +1,23 @@
+ """
+ HuggingFace Spaces entry point for hf-eda-mcp server.
+
+ This file is used when deploying to HuggingFace Spaces.
+ It imports and launches the main Gradio application.
+ """
+
+ import os
+ import sys
+
+ # Add the src directory (next to this file) to the import path
+ sys.path.insert(0, os.path.join(os.path.dirname(__file__), "src"))
+
+ from hf_eda_mcp.server import create_gradio_app
+
+ # Create and launch the Gradio app
+ if __name__ == "__main__":
+     app = create_gradio_app()
+     app.launch(
+         server_name="0.0.0.0",
+         server_port=7860,
+         share=False
+     )
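Note that `app.py` hardcodes port 7860 while the Dockerfile also exports `GRADIO_SERVER_PORT`. If the entry point should honor that variable instead, a fallback helper along these lines would do it (a sketch; `resolve_port` is an assumption, not part of the commit):

```python
import os

def resolve_port(default: int = 7860) -> int:
    """Return GRADIO_SERVER_PORT from the environment, or a default.

    Falls back to the default when the variable is unset or not an integer.
    """
    try:
        return int(os.environ.get("GRADIO_SERVER_PORT", ""))
    except ValueError:
        return default

os.environ["GRADIO_SERVER_PORT"] = "8080"
print(resolve_port())  # 8080
```

The result would then be passed as `server_port=resolve_port()` in the `app.launch(...)` call.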
CONFIGURATION.md → docs/CONFIGURATION.md RENAMED
File without changes
MCP_USAGE.md → docs/MCP_USAGE.md RENAMED
File without changes
docs/deployment/DEPLOYMENT.md ADDED
@@ -0,0 +1,300 @@
+ # Deployment Guide
+
+ This guide covers different deployment options for the hf-eda-mcp server.
+
+ ## Table of Contents
+
+ - [Local Development](#local-development)
+ - [Docker Deployment](#docker-deployment)
+ - [HuggingFace Spaces](#huggingface-spaces)
+ - [Production Considerations](#production-considerations)
+
+ ---
+
+ ## Local Development
+
+ ### Prerequisites
+
+ - Python 3.13+
+ - PDM (Python package manager)
+ - HuggingFace account (optional, for private datasets)
+
+ ### Setup
+
+ 1. Clone the repository:
+    ```bash
+    git clone https://github.com/your-username/hf-eda-mcp.git
+    cd hf-eda-mcp
+    ```
+
+ 2. Install dependencies:
+    ```bash
+    pdm install
+    ```
+
+ 3. Configure environment variables:
+    ```bash
+    cp config.example.env .env
+    # Edit .env and add your HF_TOKEN if needed
+    ```
+
+ 4. Run the server:
+    ```bash
+    pdm run hf-eda-mcp
+    ```
+
+ The server will start on `http://localhost:7860` with MCP enabled.
+
+ ---
+
+ ## Docker Deployment
+
+ ### Build the Image
+
+ ```bash
+ docker build -t hf-eda-mcp:latest .
+ ```
+
+ ### Run with Docker
+
+ ```bash
+ docker run -d \
+   --name hf-eda-mcp-server \
+   -p 7860:7860 \
+   -e HF_TOKEN=your_token_here \
+   -v hf-cache:/app/cache \
+   hf-eda-mcp:latest
+ ```
+
+ ### Run with Docker Compose
+
+ 1. Create a `.env` file with your configuration:
+    ```bash
+    HF_TOKEN=your_token_here
+    ```
+
+ 2. Start the service:
+    ```bash
+    docker-compose up -d
+    ```
+
+ 3. View logs:
+    ```bash
+    docker-compose logs -f
+    ```
+
+ 4. Stop the service:
+    ```bash
+    docker-compose down
+    ```
+
+ ### Docker Configuration Options
+
+ Environment variables you can set:
+
+ - `HF_TOKEN`: HuggingFace API token
+ - `GRADIO_SERVER_NAME`: Server host (default: `0.0.0.0`)
+ - `GRADIO_SERVER_PORT`: Server port (default: `7860`)
+ - `HF_HOME`: Cache directory for HuggingFace
+ - `MCP_SERVER_ENABLED`: Enable MCP server (default: `true`)
+
+ ---
+
+ ## HuggingFace Spaces
+
+ ### Deployment Steps
+
+ 1. **Create a new Space**:
+    - Go to https://huggingface.co/spaces
+    - Click "Create new Space"
+    - Choose "Gradio" as the SDK
+    - Select SDK version 5.49.1 or higher
+
+ 2. **Upload files**:
+    ```bash
+    # Copy files to Spaces directory
+    cp -r src/ spaces/
+    cp README.md LICENSE spaces/
+
+    # Initialize git in spaces directory
+    cd spaces
+    git init
+    git remote add origin https://huggingface.co/spaces/YOUR-USERNAME/hf-eda-mcp
+    ```
+
+ 3. **Configure the Space**:
+    - Copy `spaces/README.md` as the Space's README
+    - Ensure `spaces/app.py` is set as the app file
+    - Add `spaces/requirements.txt` for dependencies
+
+ 4. **Set secrets** (for private datasets):
+    - Go to Space settings
+    - Add `HF_TOKEN` as a secret
+
+ 5. **Deploy**:
+    ```bash
+    git add .
+    git commit -m "Initial deployment"
+    git push origin main
+    ```
+
+ ### Space Configuration
+
+ The Space will automatically:
+ - Install dependencies from `requirements.txt`
+ - Run `app.py` as the entry point
+ - Expose the MCP server at `/gradio_api/mcp/sse`
+
+ ### Accessing the Space
+
+ Your MCP server will be available at:
+ ```
+ https://YOUR-USERNAME-hf-eda-mcp.hf.space/gradio_api/mcp/sse
+ ```
+
+ ---
+
+ ## Production Considerations
+
+ ### Security
+
+ 1. **Authentication**:
+    - Use environment variables for sensitive data
+    - Never commit tokens to version control
+    - Rotate tokens regularly
+
+ 2. **Access Control**:
+    - Consider implementing rate limiting
+    - Use HTTPS for all connections
+    - Validate all input parameters
+
+ 3. **Secrets Management**:
+    - Use Docker secrets or environment files
+    - For Spaces, use the built-in secrets feature
+    - Consider using a secrets manager (AWS Secrets Manager, HashiCorp Vault)
+
+ ### Performance
+
+ 1. **Caching**:
+    - Enable persistent cache volumes
+    - Configure appropriate cache sizes
+    - Monitor cache hit rates
+
+ 2. **Resource Limits**:
+    - Set memory limits in Docker
+    - Configure appropriate timeouts
+    - Monitor CPU and memory usage
+
+ 3. **Scaling**:
+    - Use load balancers for multiple instances
+    - Consider horizontal scaling for high traffic
+    - Monitor response times and adjust resources
+
+ ### Monitoring
+
+ 1. **Logging**:
+    - Configure structured logging
+    - Use log aggregation tools (ELK, Splunk)
+    - Monitor error rates
+
+ 2. **Metrics**:
+    - Track request counts and latencies
+    - Monitor cache performance
+    - Set up alerts for errors
+
+ 3. **Health Checks**:
+    - Implement health check endpoints
+    - Configure container health checks
+    - Set up uptime monitoring
+
+ ### Backup and Recovery
+
+ 1. **Data Backup**:
+    - Backup cache volumes regularly
+    - Document configuration settings
+    - Version control all code
+
+ 2. **Disaster Recovery**:
+    - Document deployment procedures
+    - Test recovery processes
+    - Maintain rollback capabilities
+
+ ---
+
+ ## Deployment Checklist
+
+ ### Pre-Deployment
+
+ - [ ] All tests passing
+ - [ ] Dependencies up to date
+ - [ ] Security scan completed
+ - [ ] Documentation updated
+ - [ ] Environment variables configured
+ - [ ] Secrets properly managed
+
+ ### Deployment
+
+ - [ ] Build successful
+ - [ ] Health checks passing
+ - [ ] MCP endpoints accessible
+ - [ ] Tools functioning correctly
+ - [ ] Logs being collected
+ - [ ] Monitoring configured
+
+ ### Post-Deployment
+
+ - [ ] Verify all tools work
+ - [ ] Check performance metrics
+ - [ ] Monitor error rates
+ - [ ] Test with MCP clients
+ - [ ] Document any issues
+ - [ ] Update runbooks
+
+ ---
+
+ ## Troubleshooting
+
+ ### Common Issues
+
+ 1. **Server won't start**:
+    - Check Python version (3.13+ required)
+    - Verify all dependencies installed
+    - Check port availability
+    - Review logs for errors
+
+ 2. **MCP connection fails**:
+    - Verify server is running
+    - Check firewall settings
+    - Confirm correct URL/port
+    - Test with curl or browser
+
+ 3. **Dataset access errors**:
+    - Verify HF_TOKEN is set
+    - Check token permissions
+    - Confirm dataset exists
+    - Test with public dataset first
+
+ 4. **Performance issues**:
+    - Check cache configuration
+    - Monitor resource usage
+    - Reduce sample sizes
+    - Enable caching
+
+ ### Getting Help
+
+ - Check logs: `docker logs hf-eda-mcp-server`
+ - Review documentation: See `docs/MCP_USAGE.md`
+ - Open an issue: GitHub repository
+ - Community support: HuggingFace forums
+
+ ---
+
+ ## Next Steps
+
+ After deployment:
+
+ 1. Configure MCP clients (see `mcp-client-examples.md`)
+ 2. Test all tools with various datasets
+ 3. Set up monitoring and alerts
+ 4. Document any custom configurations
+ 5. Share your Space with the community!
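The cache variables listed under "Docker Configuration Options" layer on each other: the `datasets` library derives its cache location from `HF_HOME` unless `HF_DATASETS_CACHE` overrides it. A sketch of that resolution (the helper name is hypothetical):

```python
from pathlib import Path

def hf_cache_dirs(env: dict) -> dict:
    """Resolve HuggingFace cache locations from environment-style settings.

    HF_DATASETS_CACHE defaults to a datasets/ subdirectory of HF_HOME,
    which itself defaults to ~/.cache/huggingface.
    """
    hf_home = env.get("HF_HOME", str(Path.home() / ".cache" / "huggingface"))
    return {
        "HF_HOME": hf_home,
        "HF_DATASETS_CACHE": env.get(
            "HF_DATASETS_CACHE", str(Path(hf_home) / "datasets")
        ),
    }

print(hf_cache_dirs({"HF_HOME": "/app/cache"}))
```

With the Docker volume mount `-v hf-cache:/app/cache` and `HF_HOME=/app/cache`, downloaded datasets land inside the persisted volume.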
docs/deployment/QUICKSTART.md ADDED
@@ -0,0 +1,148 @@
+ # Quick Start Guide
+
+ Get hf-eda-mcp running in minutes!
+
+ ## Choose Your Deployment Method
+
+ ### 🚀 Option 1: Local Development (Fastest)
+
+ ```bash
+ # Install dependencies
+ pdm install
+
+ # Set up environment (optional for public datasets)
+ cp config.example.env .env
+ # Edit .env and add HF_TOKEN if needed
+
+ # Run the server
+ pdm run hf-eda-mcp
+ ```
+
+ Server runs at: `http://localhost:7860`
+
+ ---
+
+ ### 🐳 Option 2: Docker (Recommended for Production)
+
+ ```bash
+ # Build the image
+ docker build -t hf-eda-mcp:latest .
+
+ # Run the container
+ docker run -d \
+   --name hf-eda-mcp-server \
+   -p 7860:7860 \
+   -e HF_TOKEN=your_token_here \
+   hf-eda-mcp:latest
+ ```
+
+ Or use Docker Compose:
+
+ ```bash
+ # Create .env file with HF_TOKEN
+ echo "HF_TOKEN=your_token_here" > .env
+
+ # Start the service
+ docker-compose up -d
+ ```
+
+ Server runs at: `http://localhost:7860`
+
+ ---
+
+ ### ☁️ Option 3: HuggingFace Spaces (Easiest for Sharing)
+
+ 1. Create a new Gradio Space on HuggingFace
+ 2. Copy files from `spaces/` directory to your Space
+ 3. Set `HF_TOKEN` as a secret in Space settings (if needed)
+ 4. Push to deploy
+
+ Your server will be at: `https://YOUR-USERNAME-hf-eda-mcp.hf.space`
+
+ ---
+
+ ## Connect an MCP Client
+
+ ### Kiro IDE
+
+ Add to `.kiro/settings/mcp.json`:
+
+ ```json
+ {
+   "mcpServers": {
+     "hf-eda-mcp": {
+       "command": "pdm",
+       "args": ["run", "hf-eda-mcp"],
+       "disabled": false
+     }
+   }
+ }
+ ```
+
+ ### Claude Desktop
+
+ Add to `claude_desktop_config.json`:
+
+ ```json
+ {
+   "mcpServers": {
+     "hf-eda-mcp": {
+       "command": "python",
+       "args": ["-m", "hf_eda_mcp"],
+       "env": {
+         "PYTHONPATH": "/path/to/hf-eda-mcp/src"
+       }
+     }
+   }
+ }
+ ```
+
+ ---
+
+ ## Test the Server
+
+ ### Using the Web Interface
+
+ 1. Open `http://localhost:7860` in your browser
+ 2. Try the tools with a sample dataset like "squad"
+
+ ### Using an MCP Client
+
+ Ask your AI assistant:
+
+ ```
+ "Get metadata for the squad dataset"
+ "Show me 5 samples from the train split of squad"
+ "Analyze the features of the squad dataset"
+ ```
+
+ ---
+
+ ## Common Issues
+
+ **Server won't start?**
+ - Check Python version: `python --version` (need 3.13+)
+ - Install dependencies: `pdm install`
+
+ **Can't access private datasets?**
+ - Set `HF_TOKEN` in your `.env` file
+ - Get token from: https://huggingface.co/settings/tokens
+
+ **Port 7860 already in use?**
+ - Change port: `GRADIO_SERVER_PORT=8080 pdm run hf-eda-mcp`
+
+ ---
+
+ ## Next Steps
+
+ - 📖 Read the full [Deployment Guide](DEPLOYMENT.md)
+ - 🔧 See [MCP Client Examples](mcp-client-examples.md)
+ - 📚 Check [MCP Usage Documentation](../MCP_USAGE.md)
+
+ ---
+
+ ## Need Help?
+
+ - Check logs: `docker logs hf-eda-mcp-server` (Docker)
+ - Review documentation in `docs/`
+ - Open an issue on GitHub
docs/deployment/mcp-client-examples.md ADDED
@@ -0,0 +1,295 @@
+ # MCP Client Configuration Examples
+
+ This document provides configuration examples for connecting various MCP clients to the hf-eda-mcp server.
+
+ ## Table of Contents
+
+ - [Kiro IDE](#kiro-ide)
+ - [Claude Desktop](#claude-desktop)
+ - [Custom MCP Client](#custom-mcp-client)
+ - [Environment Variables](#environment-variables)
+
+ ---
+
+ ## Kiro IDE
+
+ ### Workspace Configuration
+
+ Create or edit `.kiro/settings/mcp.json` in your workspace:
+
+ ```json
+ {
+   "mcpServers": {
+     "hf-eda-mcp": {
+       "command": "docker",
+       "args": [
+         "run",
+         "--rm",
+         "-i",
+         "-p", "7860:7860",
+         "--env-file", ".env",
+         "hf-eda-mcp:latest"
+       ],
+       "env": {
+         "HF_TOKEN": "${HF_TOKEN}"
+       },
+       "disabled": false,
+       "autoApprove": [
+         "get_dataset_metadata",
+         "get_dataset_sample",
+         "analyze_dataset_features"
+       ]
+     }
+   }
+ }
+ ```
+
+ ### User-Level Configuration
+
+ Edit `~/.kiro/settings/mcp.json` for global configuration:
+
+ ```json
+ {
+   "mcpServers": {
+     "hf-eda-mcp": {
+       "command": "pdm",
+       "args": ["run", "hf-eda-mcp"],
+       "env": {
+         "HF_TOKEN": "your_token_here"
+       },
+       "disabled": false,
+       "autoApprove": []
+     }
+   }
+ }
+ ```
+
+ ### Using HuggingFace Spaces
+
+ ```json
+ {
+   "mcpServers": {
+     "hf-eda-mcp": {
+       "url": "https://your-username-hf-eda-mcp.hf.space/gradio_api/mcp/sse",
+       "disabled": false,
+       "autoApprove": ["get_dataset_metadata"]
+     }
+   }
+ }
+ ```
+
+ ---
+
+ ## Claude Desktop
+
+ ### Configuration File Location
+
+ - **macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
+ - **Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
+ - **Linux**: `~/.config/Claude/claude_desktop_config.json`
+
+ ### Local Server Configuration
+
+ ```json
+ {
+   "mcpServers": {
+     "hf-eda-mcp": {
+       "command": "python",
+       "args": ["-m", "hf_eda_mcp"],
+       "env": {
+         "HF_TOKEN": "your_token_here",
+         "PYTHONPATH": "/path/to/hf-eda-mcp/src"
+       }
+     }
+   }
+ }
+ ```
+
+ ### Docker Configuration
+
+ ```json
+ {
+   "mcpServers": {
+     "hf-eda-mcp": {
+       "command": "docker",
+       "args": [
+         "run",
+         "--rm",
+         "-i",
+         "-p", "7860:7860",
+         "-e", "HF_TOKEN=your_token_here",
+         "hf-eda-mcp:latest"
+       ]
+     }
+   }
+ }
+ ```
+
+ ### HuggingFace Spaces Configuration
+
+ ```json
+ {
+   "mcpServers": {
+     "hf-eda-mcp": {
+       "url": "https://your-username-hf-eda-mcp.hf.space/gradio_api/mcp/sse"
+     }
+   }
+ }
+ ```
+
+ ---
+
+ ## Custom MCP Client
+
+ ### Python Client Example
+
+ ```python
+ import asyncio
+ from mcp import ClientSession, StdioServerParameters
+ from mcp.client.stdio import stdio_client
+
+ async def main():
+     # Connect to local server
+     server_params = StdioServerParameters(
+         command="python",
+         args=["-m", "hf_eda_mcp"],
+         env={"HF_TOKEN": "your_token_here"}
+     )
+
+     async with stdio_client(server_params) as (read, write):
+         async with ClientSession(read, write) as session:
+             # Initialize the connection
+             await session.initialize()
+
+             # List available tools
+             tools = await session.list_tools()
+             print("Available tools:", tools)
+
+             # Call a tool
+             result = await session.call_tool(
+                 "get_dataset_metadata",
+                 arguments={"dataset_id": "squad"}
+             )
+             print("Result:", result)
+
+ if __name__ == "__main__":
+     asyncio.run(main())
+ ```
+
+ ### JavaScript/TypeScript Client Example
+
+ ```typescript
+ import { Client } from "@modelcontextprotocol/sdk/client/index.js";
+ import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";
+
+ async function main() {
+   const transport = new StdioClientTransport({
+     command: "python",
+     args: ["-m", "hf_eda_mcp"],
+     env: {
+       HF_TOKEN: process.env.HF_TOKEN
+     }
+   });
+
+   const client = new Client({
+     name: "hf-eda-client",
+     version: "1.0.0"
+   }, {
+     capabilities: {}
+   });
+
+   await client.connect(transport);
+
+   // List tools
+   const tools = await client.listTools();
+   console.log("Available tools:", tools);
+
+   // Call a tool
+   const result = await client.callTool({
+     name: "get_dataset_metadata",
+     arguments: {
+       dataset_id: "squad"
+     }
+   });
+   console.log("Result:", result);
+
+   await client.close();
+ }
+
+ main().catch(console.error);
+ ```
+
+ ---
+
+ ## Environment Variables
+
+ ### Required Variables
+
+ - `HF_TOKEN`: HuggingFace API token (optional for public datasets, required for private datasets)
+
+ ### Optional Variables
+
+ - `HF_HOME`: Directory for HuggingFace cache (default: `~/.cache/huggingface`)
+ - `HF_DATASETS_CACHE`: Directory for datasets cache
+ - `TRANSFORMERS_CACHE`: Directory for transformers cache
+ - `GRADIO_SERVER_NAME`: Server host (default: `0.0.0.0`)
+ - `GRADIO_SERVER_PORT`: Server port (default: `7860`)
+ - `MCP_SERVER_ENABLED`: Enable MCP server (default: `true`)
+
+ ### Example .env File
+
+ ```bash
+ # HuggingFace Authentication
+ HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+
+ # Cache Configuration
+ HF_HOME=/path/to/cache
+ HF_DATASETS_CACHE=/path/to/cache/datasets
+ TRANSFORMERS_CACHE=/path/to/cache/transformers
+
+ # Server Configuration
+ GRADIO_SERVER_NAME=0.0.0.0
+ GRADIO_SERVER_PORT=7860
+ MCP_SERVER_ENABLED=true
+ ```
+
+ ---
+
+ ## Deployment Options Comparison
+
+ | Option | Pros | Cons | Best For |
+ |--------|------|------|----------|
+ | **Local (PDM)** | Fast, easy debugging | Requires Python setup | Development |
+ | **Docker** | Isolated, reproducible | Requires Docker | Production, CI/CD |
+ | **HF Spaces** | Hosted, no maintenance | Limited control | Public sharing |
+
+ ---
+
+ ## Troubleshooting
+
+ ### Connection Issues
+
+ 1. **Server not starting**: Check logs for errors, verify dependencies installed
+ 2. **Authentication failed**: Verify `HF_TOKEN` is set correctly
+ 3. **Port already in use**: Change `GRADIO_SERVER_PORT` to a different port
+
+ ### Tool Execution Issues
+
+ 1. **Dataset not found**: Verify dataset ID is correct on HuggingFace Hub
+ 2. **Permission denied**: Ensure `HF_TOKEN` has access to private datasets
+ 3. **Timeout errors**: Increase timeout settings or use smaller sample sizes
+
+ ### Docker Issues
+
+ 1. **Image build fails**: Ensure all dependencies in `pyproject.toml` are compatible
+ 2. **Container exits immediately**: Check logs with `docker logs hf-eda-mcp-server`
+ 3. **Cache not persisting**: Verify volume mounts in `docker-compose.yml`
+
+ ---
+
+ ## Additional Resources
+
+ - [MCP Protocol Documentation](https://modelcontextprotocol.io/)
+ - [Gradio MCP Integration](https://www.gradio.app/guides/gradio-and-mcp)
+ - [HuggingFace Hub Documentation](https://huggingface.co/docs/hub/index)
+ - [Project Repository](https://github.com/your-username/hf-eda-mcp)
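The example `.env` file above is plain `KEY=VALUE` lines with `#` comments. A minimal parser for that format (for illustration only; in real deployments tools like `python-dotenv` or Docker's `--env-file` handle this) looks like:

```python
def parse_env_file(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

sample = "# Server Configuration\nGRADIO_SERVER_PORT=7860\nMCP_SERVER_ENABLED=true\n"
print(parse_env_file(sample))
```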
requirements.txt ADDED
@@ -0,0 +1,9 @@
+ # HuggingFace Spaces requirements
+ # Generated from pyproject.toml for Spaces deployment
+
+ gradio[mcp]>=5.49.1
+ datasets>=4.3.0
+ huggingface_hub>=0.20.0
+ pydantic>=2.0.0
+ pandas>=2.0.0
+ numpy>=1.24.0
src/hf_eda_mcp/services/dataset_service.py CHANGED
@@ -15,8 +15,7 @@ from datasets import load_dataset
  from datasets.utils.logging import disable_progress_bar
 
  from hf_eda_mcp.integrations.hf_client import (
      HfClient,
-     HfClientError,
      DatasetNotFoundError,
      AuthenticationError,
      NetworkError
@@ -25,7 +24,6 @@ from hf_eda_mcp.error_handling import (
      retry_with_backoff,
      RetryConfig,
      log_error_with_context,
-     format_error_response
  )
 
  logger = logging.getLogger(__name__)