---
title: Vosk Arabic Speech-to-Text API
emoji: 🗣️
colorFrom: gray
colorTo: green
sdk: docker
app_file: app.py
pinned: false
---
Tunisian Arabic Speech-to-Text API
This Space hosts a lightweight speech recognition API built on vosk-model-small-ar-tn-0.1-linto, a Vosk model tailored to the Tunisian dialect. Upload an audio file to the FastAPI endpoint and get the transcription back in the response.
Features
- Supports the Tunisian dialect, not just Modern Standard Arabic (MSA)
- Fast, offline, and CPU-friendly
- Uses vosk-model-small-ar-tn-0.1-linto (~40 MB)
- REST API endpoint for audio transcription
- Easy to test locally or remotely
Model Details
| Model | Description |
|---|---|
| vosk-model-small-ar-tn | Lightweight Tunisian Arabic model by LinTO |
| Size | ~40 MB |
| Type | Kaldi-based (Vosk), optimized for small CPUs |
| Accuracy | Good for clear speech in Tunisian dialect |
| Input | 16 kHz mono .wav files |
| Output | Plain Arabic text (Tunisian dialect) |
Ideal for offline applications and edge devices.
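Because the model expects 16 kHz mono .wav input, it can help to check a file's header before uploading it. The following is a minimal sketch using only Python's standard wave module; the function name check_wav_format is hypothetical and not part of this Space's code.

```python
import wave


def check_wav_format(path, expected_rate=16000, expected_channels=1):
    """Return a list of problems found in the WAV header (empty if none)."""
    problems = []
    with wave.open(path, "rb") as wav:
        # Compare the header values against what the model expects.
        if wav.getframerate() != expected_rate:
            problems.append(
                f"sample rate is {wav.getframerate()} Hz, expected {expected_rate} Hz"
            )
        if wav.getnchannels() != expected_channels:
            problems.append(
                f"{wav.getnchannels()} channel(s), expected {expected_channels}"
            )
    return problems
```

Calling check_wav_format("sample.wav") on a conforming file should return an empty list; any returned messages suggest the file needs resampling or downmixing first.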
Quick Start (API)
Endpoint: POST /transcribe/tunisian
Send a .wav audio file and receive a transcription in Arabic.
Example curl:
curl -X POST http://localhost:7860/transcribe/tunisian \
  -F "audio=@sample.wav"
Example Response:
{
  "transcript": "شنو حوالك اليوم؟"
}
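This README does not show app.py, so the following is only a sketch of how a handler behind POST /transcribe/tunisian might turn an uploaded .wav into the JSON shape shown above. It assumes a recognizer object exposing Vosk's AcceptWaveform, Result, and FinalResult methods; the helper name transcribe_wav_bytes and the chunk size are illustrative assumptions.

```python
import io
import json
import wave


def transcribe_wav_bytes(wav_bytes, recognizer, chunk_frames=4000):
    """Feed WAV audio to a Vosk-style recognizer and return {"transcript": ...}."""
    parts = []
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        while True:
            chunk = wav.readframes(chunk_frames)
            if not chunk:
                break
            # AcceptWaveform returns True when an utterance has ended;
            # Result() then holds that utterance as a JSON string.
            if recognizer.AcceptWaveform(chunk):
                parts.append(json.loads(recognizer.Result()).get("text", ""))
    # FinalResult() flushes whatever audio is still buffered.
    parts.append(json.loads(recognizer.FinalResult()).get("text", ""))
    return {"transcript": " ".join(p for p in parts if p)}
```

With the real library, the recognizer would typically be created as KaldiRecognizer(Model("model/vosk-model-small-ar-tn-0.1-linto"), 16000) from the vosk package. Collecting Result() each time AcceptWaveform returns True keeps the text of utterances that finish partway through the file.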
Local Testing
- Clone this repository.
- Install dependencies:
  pip install -r requirements.txt
- Make sure the model is extracted under model/ like this:
  model/
  └── vosk-model-small-ar-tn-0.1-linto
      ├── am
      ├── conf
      └── etc.
- Run locally:
  python app.py
- Test the /transcribe/tunisian endpoint with a .wav file.
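A quick way to confirm the model layout before starting the server is to check for the expected subdirectories. This sketch only looks for am and conf, the two directories named in this README; the helper name missing_model_parts is hypothetical.

```python
from pathlib import Path


def missing_model_parts(model_dir, required=("am", "conf")):
    """Return the names of required subdirectories missing from model_dir."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_dir()]
```

For example, missing_model_parts("model/vosk-model-small-ar-tn-0.1-linto") returns an empty list when both directories are present, and the names of the missing ones otherwise.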
Docker for Hugging Face Spaces
If you use a Docker-based Space, here's a sample Dockerfile:
# Use a minimal base image
FROM python:3.9-slim
# Install unzip (to extract the model) and ffmpeg
RUN apt-get update && apt-get install -y unzip ffmpeg && rm -rf /var/lib/apt/lists/*
# Create a non-root user for security
RUN useradd -m user
USER user
# Set environment variables
ENV HOME=/home/user \
PATH=/home/user/.local/bin:$PATH \
PORT=7860
# Set the working directory
WORKDIR $HOME/app
# Copy requirements and install dependencies
COPY --chown=user requirements.txt ./
RUN pip install --upgrade pip && \
pip install -r requirements.txt
# Copy application files and the model zip
COPY --chown=user ./ $HOME/app
# Unzip the model file
RUN unzip vosk-model-small-ar-tn-0.1-linto.zip -d model && rm vosk-model-small-ar-tn-0.1-linto.zip
# Expose the correct port for Hugging Face Spaces
EXPOSE 7860
# Run the FastAPI app with uvicorn directly
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
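Example Python Client
import requests

# Upload sample.wav as the "audio" form field and print the JSON response
with open("sample.wav", "rb") as audio_file:
    response = requests.post(
        "http://localhost:7860/transcribe/tunisian",
        files={"audio": audio_file},
    )
response.raise_for_status()  # Fail loudly on HTTP errors
print(response.json())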
File Structure
.
├── app.py            # FastAPI app with transcription endpoint
├── model/            # Contains the Vosk model
├── requirements.txt  # Dependencies (FastAPI, Vosk, etc.)
├── sample.wav        # Example audio file
└── Dockerfile        # For deployment
Dependencies
fastapi
uvicorn
vosk
soundfile
numpy
Maintainer
Inherited Games Studio
Email: [email protected]
GitHub: github.com/inheritedgames
License
MIT License
Credits
- Model: vosk-model-small-ar-tn-0.1-linto
- Framework: FastAPI
- Hosting: Hugging Face Spaces