YatharthS

metadata

license: apache-2.0
pipeline_tag: audio-to-audio
tags:
  - pytorch
  - audio
  - upsampling

FlashSR

FlashSR is a 2 MB audio super-resolution model based on the HierSpeech++ upsampler architecture. It upscales 16 kHz audio to 48 kHz at 200x to 400x real-time speed.

Details

Model Size: 2 MB
Input Rate: 16 kHz
Output Rate: 48 kHz
Inference Speed: 200x - 400x real-time, depending on GPU and dtype
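
To make these figures concrete, here is a small arithmetic sketch in plain Python. It does not call FlashSR; the helper names `output_samples` and `processing_seconds` are hypothetical and exist only for illustration. It assumes "Nx real-time" means N seconds of audio processed per second of wall-clock time.

```python
# Illustrative arithmetic only; these helpers are not part of FlashSR.
INPUT_RATE_HZ = 16_000   # input rate stated above
OUTPUT_RATE_HZ = 48_000  # output rate stated above


def output_samples(num_input_samples: int) -> int:
    """Number of samples after 16 kHz -> 48 kHz upsampling (a 3x increase)."""
    return num_input_samples * OUTPUT_RATE_HZ // INPUT_RATE_HZ


def processing_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Wall-clock time needed to process a clip at a given real-time factor."""
    return audio_seconds / realtime_factor


one_minute = 60 * INPUT_RATE_HZ          # 960,000 input samples
print(output_samples(one_minute))        # 2880000
print(processing_seconds(60.0, 200.0))   # 0.3
print(processing_seconds(60.0, 400.0))   # 0.15
```

Under that assumption, one minute of 16 kHz audio would take roughly 0.15 to 0.3 seconds to upsample at the stated speeds.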

Performance Summary

FlashSR is designed for high-speed frequency reconstruction. It offers a significantly lower computational footprint compared to alternatives such as Resemble-Enhance and ClearerVoice, while maintaining similar output quality.

Benchmark Comparison

Model	Speed	Size
FlashSR	200x - 400x real-time	2 MB
Resemble-Enhance	< 20x real-time	~700 MB+
ClearerVoice	< 20x real-time	~200 MB+
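
The README does not describe how the speeds above were measured. One possible way to estimate a real-time factor for any upsampler is sketched below. `measure_realtime_factor` is a hypothetical helper, and `repeat_each_sample_3x` is a trivial stand-in (sample repetition) so the snippet runs without the model; neither is FlashSR code.

```python
import time
from typing import Callable, Sequence


def measure_realtime_factor(
    upsample: Callable[[Sequence[float]], Sequence[float]],
    samples: Sequence[float],
    input_rate_hz: int = 16_000,
    repeats: int = 3,
) -> float:
    """Return seconds of audio processed per second of wall-clock time."""
    audio_seconds = len(samples) / input_rate_hz
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        upsample(samples)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading on very coarse timers.
    return audio_seconds / max(best, 1e-9)


def repeat_each_sample_3x(samples: Sequence[float]) -> list:
    """Stand-in upsampler (NOT FlashSR): repeats each sample three times."""
    return [s for s in samples for _ in range(3)]


one_second_of_silence = [0.0] * 16_000
rtf = measure_realtime_factor(repeat_each_sample_3x, one_second_of_silence)
print(f"stand-in upsampler: {rtf:.0f}x real-time")
```

When timing a PyTorch model on a GPU, kernels run asynchronously, so the timed call would also need to wait for the device to finish (for example with `torch.cuda.synchronize()`) before the clock is read; otherwise the measured factor will look higher than it is.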

Usage

Usage instructions and source code are available on GitHub: https://github.com/ysharma3501/FlashSR

Credits

Thanks to the authors of HierSpeech++, as this model is based on its 48 kHz upsampler.