---
title: Vosk Arabic Speech-to-Text API
emoji: πŸ—£οΈ
colorFrom: gray
colorTo: green
sdk: docker
app_file: app.py
pinned: false
---

πŸ§β€β™‚οΈ Arabic Tunisian Speech-to-Text API

This Space hosts a lightweight speech recognition API built on `vosk-model-small-ar-tn-0.1-linto`, a model tailored for the Tunisian dialect. Upload an audio file to the FastAPI endpoint and get back a transcription in real time.
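The actual implementation lives in `app.py` in this repository; the snippet below is only a minimal sketch of how such an endpoint can be wired together with FastAPI and Vosk. The model path, the `audio` field name, and the chunk size are assumptions, not the Space's exact code:

```python
import io
import json
import wave

from fastapi import FastAPI, File, UploadFile
from vosk import KaldiRecognizer, Model

app = FastAPI()
# Load the Tunisian Arabic model once at startup (path is an assumption).
model = Model("model/vosk-model-small-ar-tn-0.1-linto")


@app.post("/transcribe/tunisian")
async def transcribe(audio: UploadFile = File(...)):
    # Read the uploaded 16 kHz mono WAV file into memory.
    wav_bytes = await audio.read()
    wf = wave.open(io.BytesIO(wav_bytes), "rb")
    rec = KaldiRecognizer(model, wf.getframerate())

    # Feed the audio to the recognizer in chunks and collect partial results.
    parts = []
    while True:
        chunk = wf.readframes(4000)
        if not chunk:
            break
        if rec.AcceptWaveform(chunk):
            parts.append(json.loads(rec.Result()).get("text", ""))
    parts.append(json.loads(rec.FinalResult()).get("text", ""))

    return {"transcript": " ".join(p for p in parts if p)}
```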


## πŸ“¦ Features

- πŸ—£οΈ Supports Tunisian dialect (not just MSA)
- ⚑ Fast, offline, and CPU-friendly
- 🧠 Uses `vosk-model-small-ar-tn-0.1-linto` (~40 MB)
- πŸ”Œ REST API endpoint for audio transcription
- πŸ§ͺ Easy to test locally or remotely

## 🧠 Model Details

| Model | Description |
|-------|-------------|
| `vosk-model-small-ar-tn-0.1-linto` | Lightweight Tunisian Arabic model by LinTO |
| Size | ~40 MB |
| Type | Kaldi-based, optimized for small CPUs |
| Accuracy | Good for clear speech in Tunisian dialect |
| Input | 16 kHz mono `.wav` files |
| Output | Plain Arabic text (Tunisian dialect) |

βœ… Ideal for offline applications and edge devices.
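Because the model expects 16 kHz mono WAV input, recordings in other formats or sample rates should be converted first. A typical ffmpeg invocation looks like the following (ffmpeg is installed in the Dockerfile below; the filenames here are placeholders):

```bash
# Resample to 16 kHz, downmix to mono, and encode as 16-bit PCM WAV
ffmpeg -i recording.mp3 -ar 16000 -ac 1 -c:a pcm_s16le sample.wav
```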


## πŸš€ Quick Start (API)

### πŸ”§ Endpoint: `POST /transcribe/tunisian`

Send a `.wav` audio file and receive a transcription in Arabic.

βœ… Example cURL:

```bash
curl -X POST http://localhost:7860/transcribe/tunisian \
  -F "audio=@sample.wav"
```

πŸ“€ Example Response:

```json
{
  "transcript": "Ψ΄Ω†ΩŠ Ψ­ΩˆΨ§Ω„Ωƒ Ψ§Ω„ΩŠΩˆΩ…ΨŸ"
}
```

## πŸ§ͺ Local Testing

1. Clone this repository.
2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```
3. Make sure the model is extracted under `model/` like this:
   ```
   model/
   └── vosk-model-small-ar-tn-0.1-linto
       β”œβ”€β”€ am
       β”œβ”€β”€ conf
       └── etc.
   ```
4. Run locally:
   ```bash
   python app.py
   ```
5. Test the `/transcribe/tunisian` endpoint with a `.wav` file.
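Put together, a typical local session looks roughly like this. The model download URL is assumed from the public Vosk model listing; adjust it if you ship the zip with the repository instead:

```bash
# Install dependencies
pip install -r requirements.txt

# Fetch and extract the model (URL assumed from the Vosk model listing)
wget https://alphacephei.com/vosk/models/vosk-model-small-ar-tn-0.1-linto.zip
unzip vosk-model-small-ar-tn-0.1-linto.zip -d model

# Start the API, then test it from another terminal
python app.py
curl -X POST http://localhost:7860/transcribe/tunisian -F "audio=@sample.wav"
```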

## 🐳 Docker for Hugging Face Spaces

If you use a Docker-based Space, here is a sample `Dockerfile`:

```dockerfile
# Use a minimal base image
FROM python:3.9-slim

# Install unzip and ffmpeg
RUN apt-get update && apt-get install -y unzip ffmpeg && rm -rf /var/lib/apt/lists/*

# Create a non-root user for security
RUN useradd -m user
USER user

# Set environment variables
ENV HOME=/home/user \
    PATH=/home/user/.local/bin:$PATH \
    PORT=7860

# Set the working directory
WORKDIR $HOME/app

# Copy requirements and install dependencies
COPY --chown=user requirements.txt ./
RUN pip install --upgrade pip && \
    pip install -r requirements.txt

# Copy application files and the model zip
COPY --chown=user ./ $HOME/app

# Unzip the model file
RUN unzip vosk-model-small-ar-tn-0.1-linto.zip -d model && rm vosk-model-small-ar-tn-0.1-linto.zip

# Expose the correct port for Hugging Face Spaces
EXPOSE 7860

# Run the FastAPI app with uvicorn directly
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
```
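To try the container locally before pushing it to a Space, a build-and-run sequence along these lines should work (the image name `tunisian-stt` is just an example):

```bash
docker build -t tunisian-stt .
docker run --rm -p 7860:7860 tunisian-stt

# Then, from the host:
curl -X POST http://localhost:7860/transcribe/tunisian -F "audio=@sample.wav"
```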

## 🧾 Example Python Client

```python
import requests

with open("sample.wav", "rb") as audio_file:
    response = requests.post(
        "http://localhost:7860/transcribe/tunisian",
        files={"audio": audio_file},
    )
    print(response.json())
```

πŸ“ File Structure

.
β”œβ”€β”€ app.py                 # FastAPI app with transcription endpoint
β”œβ”€β”€ model/                 # Contains the Vosk model
β”œβ”€β”€ requirements.txt       # Dependencies (FastAPI, Vosk, etc.)
β”œβ”€β”€ sample.wav             # Example audio file
└── Dockerfile             # For deployment

## πŸ›  Dependencies

```
fastapi
uvicorn
vosk
soundfile
numpy
```

πŸ‘©β€πŸ’» Maintainer

Inherited Games Studio πŸ“§ [email protected] πŸ”— github.com/inheritedgames


## πŸ“„ License

MIT License


## 🧠 Credits