---
title: MedASR Medical Speech Recognition
emoji: 🏥
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 6.2.0
app_file: app.py
pinned: false
license: apache-2.0
---

# 🏥 MedASR - Medical Speech Recognition Demo

This is a Hugging Face Space demo for the [MedASR](https://huggingface.co/google/medasr) model.

## About MedASR

MedASR is a speech-to-text model based on the Conformer architecture, pre-trained specifically for medical dictation. It is designed to handle:

- ✅ Medical terminology
- ✅ Radiology dictation
- ✅ Physician-patient conversations
- ✅ Various medical specialities

## Model Details

| Property | Value |
|----------|-------|
| **Model Type** | Automatic Speech Recognition |
| **Architecture** | Conformer |
| **Parameters** | 105M |
| **Input** | Mono-channel audio at 16 kHz |
| **Output** | Text |
| **License** | Health AI Developer Foundations |

## Usage

1. Click the microphone icon to record audio, or upload an audio file.
2. Click the "Transcribe" button.
3. View the transcribed medical text.

## Performance

Word error rate (WER) by dataset; lower is better.

| Dataset | MedASR WER |
|---------|-----------|
| RAD-DICT | 6.6% |
| GENERAL-DICT | 9.3% |
| FM-DICT | 8.1% |
| MIMIC | 6.6% |

## Citation

    @inproceedings{wu2023last,
      title={{LAST}: Scalable Lattice-Based Speech Modelling in {JAX}},
      author={Wu, Ke and Variani, Ehsan and Bagby, Tom and Riley, Michael},
      booktitle={ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
      pages={1–5},
      year={2023},
      organization={IEEE}
    }

## Resources

- [Model Card](https://huggingface.co/google/medasr)
- [GitHub Repository](https://github.com/google-research/medasr)
- [Quick Start Notebook](https://github.com/google-research/medasr)
- [Fine-tuning Notebook](https://github.com/google-research/medasr)

## License

The use of MedASR is governed by the Health AI Developer Foundations terms of use.
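
## Computing WER

The Performance figures are word error rates. This README does not say how they were computed or what text normalization was applied, so the following is only a minimal sketch of the standard definition: the word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words. The function name `word_error_rate` and the lowercase-and-split normalization are assumptions made for illustration, not part of MedASR or this Space.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = word-level edit distance / number of reference words.

    Assumption: text is normalized only by lowercasing and whitespace
    splitting. Real evaluations usually apply richer normalization.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Row-by-row dynamic programming for Levenshtein distance over words.
    # prev[j] is the distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "no acute cardiopulmonary abnormality is seen"
    hypothesis = "no acute cardio pulmonary abnormality seen"
    # One substitution, one insertion, one deletion over six reference words.
    print(f"WER: {word_error_rate(reference, hypothesis):.1%}")  # WER: 50.0%
```

Because insertions are counted, WER can exceed 100% when the hypothesis is much longer than the reference. Scores computed this way are only comparable to the table above if the same normalization and test data are used.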