QA.md
# Common Questions and Answers
## General Comments

**OpenVoice is a Technology, not a Product**

Although OpenVoice works on a majority of voices when used correctly, please do not expect it to work perfectly in every case: it takes a lot of engineering effort to turn a technology into a stable product. The target users of this technology are developers and researchers, not end users, who expect a perfect product. That said, we are confident that OpenVoice is the state of the art among source-available voice cloning technologies.

The contribution of OpenVoice is a versatile technical approach to instant voice cloning, not a ready-to-use, perfect voice cloning product. However, we firmly believe that releasing OpenVoice will accelerate the open research community's progress on instant voice cloning, and that free voice cloning methods will someday be as good as commercial ones.
## Issues with Voice Quality

**Accent and Emotion of the Generated Voice are not Similar to the Reference Voice**

First of all, OpenVoice only clones the tone color of the reference speaker. It does NOT clone the accent or emotion. The accent and emotion are controlled by the base speaker TTS model, not cloned by the tone color converter (please refer to our [paper](https://arxiv.org/pdf/2312.01479.pdf) for technical details). To change the accent or emotion of the output, you need a base speaker model with that accent or emotion. OpenVoice is flexible enough for users to integrate their own base speaker model into the framework by simply replacing the base speaker we provide.

**Bad Audio Quality of the Generated Speech**

Please check the following:
- Is your reference audio clean enough, without any background noise? You can find some high-quality reference speech [here](https://aiartes.com/voiceai).
- Is your audio too short?
- Does your audio contain speech from more than one person?
- Does the reference audio contain long blank sections?
- Did you give the reference audio the same name as one you used before, but forget to delete the `processed` folder?
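
Two of the checks above (clip length and long blank sections) can be roughly automated. The following is a minimal sketch, not part of OpenVoice: the function name `check_reference_wav` and all thresholds are illustrative assumptions, and it only handles 16-bit PCM WAV input. Background noise and multiple speakers still need to be checked by ear.

```python
import array
import sys
import wave


def check_reference_wav(path, min_seconds=3.0, max_silence_seconds=1.0,
                        silence_threshold=500):
    """Return a list of warnings for a 16-bit PCM WAV reference clip.

    All thresholds are illustrative guesses, not values used by OpenVoice.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        rate = wav.getframerate()
        channels = wav.getnchannels()
        samples = array.array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian

    # Collapse interleaved channels to one peak amplitude per frame.
    frames = [
        max(abs(s) for s in samples[i:i + channels])
        for i in range(0, len(samples), channels)
    ]

    warnings = []
    duration = len(frames) / rate
    if duration < min_seconds:
        warnings.append(f"audio is short ({duration:.1f}s)")

    # Find the longest run of consecutive frames below the threshold.
    longest = current = 0
    for amplitude in frames:
        current = current + 1 if amplitude < silence_threshold else 0
        longest = max(longest, current)
    if longest / rate > max_silence_seconds:
        warnings.append(f"long blank section ({longest / rate:.1f}s)")
    return warnings
```

An empty list means neither check fired. It does not guarantee that the clip will clone well.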
## Issues with Languages

**Support of Other Languages**

For multi-lingual and cross-lingual usage, please refer to [`demo_part2.ipynb`](https://github.com/myshell-ai/OpenVoice/blob/main/demo_part2.ipynb). OpenVoice supports any language as long as you have a base speaker in that language. The OpenVoice team has already done the most difficult part (training the tone color converter) for you. A base speaker TTS model is relatively easy to train, and multiple existing open-source repositories support it. If you don't want to train one yourself, simply use the OpenAI TTS model as the base speaker.
## Issues with Installation
**Error Related to Silero**

When calling `get_vad_segments` from `se_extractor.py`, there should be a message like this:
```
Downloading: "https://github.com/snakers4/silero-vad/zipball/master" to /home/user/.cache/torch/hub/master.zip
```
The download will fail if your machine cannot access GitHub. In that case, download the zip from "https://github.com/snakers4/silero-vad/zipball/master" manually and unzip it to `/home/user/.cache/torch/hub/snakers4_silero-vad_master`. You can also see [this issue](https://github.com/myshell-ai/OpenVoice/issues/57) for solutions for other versions of Silero.
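
The manual unzip step can be scripted. The sketch below is not part of OpenVoice: `install_silero_zip` is a hypothetical helper, the default cache location generalizes the `/home/user/.cache/torch/hub` path above to the current user's home directory, and it assumes the zipball wraps its contents in a single top-level folder that has to be renamed.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def install_silero_zip(zip_path, hub_dir=Path.home() / ".cache" / "torch" / "hub"):
    """Unpack a manually downloaded silero-vad zipball into the torch hub cache."""
    hub_dir = Path(hub_dir)
    target = hub_dir / "snakers4_silero-vad_master"
    if target.exists():
        return target  # already present; leave it untouched
    hub_dir.mkdir(parents=True, exist_ok=True)

    # Extract next to the target so the final move stays on one filesystem.
    with tempfile.TemporaryDirectory(dir=hub_dir) as tmp:
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(tmp)
        entries = list(Path(tmp).iterdir())
        # Assumption: the archive contains exactly one top-level folder.
        if len(entries) != 1 or not entries[0].is_dir():
            raise ValueError("expected a single top-level folder in the zip")
        shutil.move(str(entries[0]), str(target))
    return target
```

Calling `install_silero_zip("master.zip")` after downloading the zip in a browser should leave the files under `snakers4_silero-vad_master`, which is the folder name given above.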
USAGE.md
# Usage
## Table of Contents

- [Quick Use](#quick-use): directly use OpenVoice without installation.
- [Linux Install](#linux-install): for researchers and developers only.
    - [V1](#openvoice-v1)
    - [V2](#openvoice-v2)
- [Install on Other Platforms](#install-on-other-platforms): unofficial installation guide contributed by the community
## Quick Use

The input speech audio of OpenVoice can be in **Any Language**. OpenVoice can clone the voice in that speech audio and use it to speak in multiple languages. For quick use, we recommend trying the already deployed services:

- [British English](https://app.myshell.ai/widget/vYjqae)
- [American English](https://app.myshell.ai/widget/nEFFJf)
- [Indian English](https://app.myshell.ai/widget/V3iYze)
- [Australian English](https://app.myshell.ai/widget/fM7JVf)
- [Spanish](https://app.myshell.ai/widget/NNFFVz)
- [French](https://app.myshell.ai/widget/z2uyUz)
- [Chinese](https://app.myshell.ai/widget/fU7nUz)
- [Japanese](https://app.myshell.ai/widget/IfIB3u)
- [Korean](https://app.myshell.ai/widget/q6ZjIn)
## Minimal Demo

For users who want to quickly try OpenVoice and do not require high quality or stability, click any of the following links:

<div align="center">
<a href="https://app.myshell.ai/bot/z6Bvua/1702636181"><img src="../resources/myshell-hd.png" height="28"></a>

<a href="https://huggingface.co/spaces/myshell-ai/OpenVoice"><img src="../resources/huggingface.png" height="32"></a>
</div>
## Linux Install

This section is only for developers and researchers who are familiar with Linux, Python and PyTorch. Create a conda environment, clone this repo, and install it:

```
conda create -n openvoice python=3.9
conda activate openvoice
git clone git@github.com:myshell-ai/OpenVoice.git
cd OpenVoice
pip install -e .
```

The installation above is the same whether you are using V1 or V2.
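
After running the commands above, a quick sanity check can confirm that the environment looks as expected. This is only a sketch: `check_environment` is a hypothetical helper, and the module names `torch` and `openvoice` are assumptions about what `pip install -e .` makes importable, so adjust them if your install differs.

```python
import importlib.util
import sys


def check_environment(modules=("torch", "openvoice"), python=(3, 9)):
    """Report whether the Python version matches and which modules are found."""
    report = {"python_ok": sys.version_info[:2] == tuple(python)}
    for name in modules:
        # find_spec locates a top-level module without importing it.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {ok}")
```

Run it inside the activated `openvoice` conda environment. A `False` entry points at the step that needs to be repeated.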
### OpenVoice V1

Download the checkpoint from [here](https://myshell-public-repo-host.s3.amazonaws.com/openvoice/checkpoints_1226.zip) and extract it to the `checkpoints` folder.

**1. Flexible Voice Style Control.**
Please see [`demo_part1.ipynb`](../demo_part1.ipynb) for an example of how OpenVoice enables flexible style control over the cloned voice.

**2. Cross-Lingual Voice Cloning.**
Please see [`demo_part2.ipynb`](../demo_part2.ipynb) for an example covering languages seen or unseen in the MSML training set.

**3. Gradio Demo.** We provide a minimalist local gradio demo here. We strongly suggest that users look into `demo_part1.ipynb`, `demo_part2.ipynb` and the [QnA](QA.md) if they run into issues with the gradio demo. Launch a local gradio demo with `python -m openvoice_app --share`.
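
The checkpoint download at the start of this section can also be scripted. The sketch below uses only the Python standard library. `fetch_checkpoints` is a hypothetical helper, not part of OpenVoice. Whether the archive already wraps its contents in a top-level `checkpoints` directory is not stated here, so the function returns the extracted top-level names for inspection instead of assuming a layout.

```python
import tempfile
import urllib.request
import zipfile
from pathlib import Path

# URL taken from the V1 download link above.
V1_URL = "https://myshell-public-repo-host.s3.amazonaws.com/openvoice/checkpoints_1226.zip"


def fetch_checkpoints(url=V1_URL, dest="checkpoints"):
    """Download a checkpoint zip, extract it under `dest`, and list what landed there."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory() as tmp:
        archive_path = Path(tmp) / "checkpoints.zip"
        urllib.request.urlretrieve(url, str(archive_path))
        with zipfile.ZipFile(archive_path) as archive:
            archive.extractall(dest)
    # Top-level entries now present in dest, for a quick layout check.
    return sorted(p.name for p in dest.iterdir())
```

If the returned list is just `["checkpoints"]`, the archive carried its own top-level folder and the contents should be moved up one level. The same helper can be pointed at another checkpoint zip by passing `url` and `dest`.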
### OpenVoice V2

Download the checkpoint from [here](https://myshell-public-repo-host.s3.amazonaws.com/openvoice/checkpoints_v2_0417.zip) and extract it to the `checkpoints_v2` folder.

Install [MeloTTS](https://github.com/myshell-ai/MeloTTS):
```
pip install git+https://github.com/myshell-ai/MeloTTS.git
python -m unidic download
```

**Demo Usage.** Please see [`demo_part3.ipynb`](../demo_part3.ipynb) for example usage of OpenVoice V2. It now natively supports English, Spanish, French, Chinese, Japanese and Korean.
## Install on Other Platforms

This section lists unofficial installation guides written by open-source contributors in the community:

- Windows
    - [Guide](https://github.com/Alienpups/OpenVoice/blob/main/docs/USAGE_WINDOWS.md) by [@Alienpups](https://github.com/Alienpups)
    - You are welcome to contribute if you have a better installation guide. We will list you here.
- Docker
    - [Guide](https://github.com/StevenJSCF/OpenVoice/blob/update-docs/docs/DF_USAGE.md) by [@StevenJSCF](https://github.com/StevenJSCF)
    - You are welcome to contribute if you have a better installation guide. We will list you here.