2020-04-13 09:42:06.769691: I tensorflow/core/common_runtime/process_util.cc:115] Creating new thread pool with default inter op setting: 4. Tune using inter_op_parallelism_threads for best performance.
b'Hello, TensorFlow!'
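The two lines above are the output of the standard TensorFlow 1.x hello-world check; for reference, a snippet along these lines (an assumed reconstruction, not copied from the original session) produces them:

import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))   # prints b'Hello, TensorFlow!'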
error: Microsoft Visual C++ 14.0 is required. Get it with "Build Tools for Visual Studio": https://visualstudio.microsoft.com/downloads/
----------------------------------------
ERROR: Failed building wheel for webrtcvad
Running setup.py clean for webrtcvad
Downloaded and installed the Build Tools (make sure to also install the optional CLI components).
Rebooted the machine.
>pip install webrtcvad
Successfully installed webrtcvad-2.0.10
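A quick way to confirm the webrtcvad wheel actually built and its C extension loads (a minimal check, not part of the original notes):

import webrtcvad
vad = webrtcvad.Vad(3)   # aggressiveness mode 0-3; constructing it exercises the native extension
print('webrtcvad OK')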
>conda install -c conda-forge librosa
matplotlib already installed
numpy already installed
scipy already installed
>conda install tqdm
>conda install -c conda-forge python-sounddevice
>conda install unidecode
>conda install inflect
pyqt already installed
pyqt 5.9.2 is not the same package as PyQt5 (see below).
>pip install PyQT5
>conda install -c conda-forge multiprocess
>conda install numba
>conda install -c conda-forge visdom
Try Demo
Launch PyCharm, New Project, and select the Real-Time-Voice-Cloning-master folder
Acknowledge the pop-up dialog asking to create the project from existing sources
Use conda env: rtvc
From the CorentinJ README:
Preliminary
Before you download any dataset, you can begin by testing your configuration with: python demo_cli.py
If all tests pass, you're good to go.
Launch PyCharm, run demo_cli.py
Results (failure: CPU-only not supported) demo_cli.py
OMP: Info #212: KMP_AFFINITY: decoding x2APIC ids.
OMP: Info #210: KMP_AFFINITY: Affinity capable, using global cpuid leaf 11 info
OMP: Info #154: KMP_AFFINITY: Initial OS proc set respected: 0-3
OMP: Info #156: KMP_AFFINITY: 4 available OS procs
OMP: Info #157: KMP_AFFINITY: Uniform topology
OMP: Info #179: KMP_AFFINITY: 1 packages x 4 cores/pkg x 1 threads/core (4 total cores)
OMP: Info #214: KMP_AFFINITY: OS proc to physical thread map:
OMP: Info #171: KMP_AFFINITY: OS proc 0 maps to package 0 core 0
OMP: Info #171: KMP_AFFINITY: OS proc 1 maps to package 0 core 1
OMP: Info #171: KMP_AFFINITY: OS proc 2 maps to package 0 core 2
OMP: Info #171: KMP_AFFINITY: OS proc 3 maps to package 0 core 3
OMP: Info #250: KMP_AFFINITY: pid 2444 tid 4876 thread 0 bound to OS proc set 0
Arguments:
enc_model_fpath: encoder\saved_models\pretrained.pt
syn_model_dir: synthesizer\saved_models\logs-pretrained
voc_model_fpath: vocoder\saved_models\pretrained\pretrained.pt
low_mem: False
no_sound: False
Your PyTorch installation is not configured to use CUDA. If you have a GPU ready for deep learning, ensure that the drivers are properly installed, and that your CUDA version matches your PyTorch installation. CPU-only inference is currently not supported.
Running a test of your configuration...
Process finished with exit code -1
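The CUDA complaint can be confirmed independently of the demo script by asking PyTorch directly whether it sees a GPU (torch.cuda.is_available() is standard PyTorch; a minimal check):

import torch
print(torch.__version__)
print(torch.cuda.is_available())   # False on a CPU-only build or with mismatched CUDA drivers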
Performed the same test on an Ubuntu 18.04 installation (VMware). Same error.
OMP: Info #212: KMP_AFFINITY: decoding x2APIC ids.
OMP: Info #210: KMP_AFFINITY: Affinity capable, using global cpuid leaf 11 info
OMP: Info #154: KMP_AFFINITY: Initial OS proc set respected: 0-3
OMP: Info #156: KMP_AFFINITY: 4 available OS procs
OMP: Info #157: KMP_AFFINITY: Uniform topology
OMP: Info #179: KMP_AFFINITY: 1 packages x 4 cores/pkg x 1 threads/core (4 total cores)
OMP: Info #214: KMP_AFFINITY: OS proc to physical thread map:
OMP: Info #171: KMP_AFFINITY: OS proc 0 maps to package 0 core 0
OMP: Info #171: KMP_AFFINITY: OS proc 1 maps to package 0 core 1
OMP: Info #171: KMP_AFFINITY: OS proc 2 maps to package 0 core 2
OMP: Info #171: KMP_AFFINITY: OS proc 3 maps to package 0 core 3
OMP: Info #250: KMP_AFFINITY: pid 944 tid 8284 thread 0 bound to OS proc set 0
Arguments:
enc_model_fpath: encoder\saved_models\pretrained.pt
syn_model_dir: synthesizer\saved_models\logs-pretrained
voc_model_fpath: vocoder\saved_models\pretrained\pretrained.pt
low_mem: False
no_sound: False
Running a test of your configuration...
Preparing the encoder, the synthesizer and the vocoder...
Your PyTorch installation is not configured to use CUDA. If you have a GPU ready for deep learning, ensure that the drivers are properly installed, and that your CUDA version matches your PyTorch installation. CPU-only inference is currently not supported.
Traceback (most recent call last):
File "D:/Projects/Voice Cloning/SV2TTS - shawwn/Real-Time-Voice-Cloning-master (python)/demo_cli.py", line 61, in <module>
encoder.load_model(args.enc_model_fpath)
File "D:\Projects\Voice Cloning\SV2TTS - shawwn\Real-Time-Voice-Cloning-master (python)\encoder\inference.py", line 33, in load_model
checkpoint = torch.load(weights_fpath, map_location=_device)
File "C:\python-programs\Anaconda3\envs\rtvc\lib\site-packages\torch\serialization.py", line 525, in load
with _open_file_like(f, 'rb') as opened_file:
File "C:\python-programs\Anaconda3\envs\rtvc\lib\site-packages\torch\serialization.py", line 212, in _open_file_like
return _open_file(name_or_buffer, mode)
File "C:\python-programs\Anaconda3\envs\rtvc\lib\site-packages\torch\serialization.py", line 193, in __init__
super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'encoder\\saved_models\\pretrained.pt'
The FileNotFoundError above simply means the pretrained models have not been downloaded yet. From the CorentinJ README: pretrained models come as an archive that contains all three models (speaker encoder, synthesizer, vocoder). The archive has the same directory structure as the repo, and you're expected to merge its contents with the root of the repository. For reference, the GPUs used for training are GTX 1080 Ti.
Encoder: trained 1.56M steps (20 days with a single GPU) with a batch size of 64
Synthesizer: trained 256k steps (1 week with 4 GPUs) with a batch size of 144
Vocoder: trained 428k steps (4 days with a single GPU) with a batch size of 100
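After merging the archive into the repository root, a quick way to confirm the files landed where demo_cli.py expects them is to check the three paths from the Arguments printout above (a minimal sketch):

from pathlib import Path

# Paths as printed in the demo_cli.py "Arguments:" block
expected = [
    Path("encoder/saved_models/pretrained.pt"),
    Path("synthesizer/saved_models/logs-pretrained"),
    Path("vocoder/saved_models/pretrained/pretrained.pt"),
]
for p in expected:
    print(p, "OK" if p.exists() else "MISSING")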
OMP: Info #212: KMP_AFFINITY: decoding x2APIC ids.
OMP: Info #210: KMP_AFFINITY: Affinity capable, using global cpuid leaf 11 info
OMP: Info #154: KMP_AFFINITY: Initial OS proc set respected: 0-3
OMP: Info #156: KMP_AFFINITY: 4 available OS procs
OMP: Info #157: KMP_AFFINITY: Uniform topology
OMP: Info #179: KMP_AFFINITY: 1 packages x 4 cores/pkg x 1 threads/core (4 total cores)
OMP: Info #214: KMP_AFFINITY: OS proc to physical thread map:
OMP: Info #171: KMP_AFFINITY: OS proc 0 maps to package 0 core 0
OMP: Info #171: KMP_AFFINITY: OS proc 1 maps to package 0 core 1
OMP: Info #171: KMP_AFFINITY: OS proc 2 maps to package 0 core 2
OMP: Info #171: KMP_AFFINITY: OS proc 3 maps to package 0 core 3
OMP: Info #250: KMP_AFFINITY: pid 8996 tid 5892 thread 0 bound to OS proc set 0
Arguments:
enc_model_fpath: encoder\saved_models\pretrained.pt
syn_model_dir: synthesizer\saved_models\logs-pretrained
voc_model_fpath: vocoder\saved_models\pretrained\pretrained.pt
low_mem: False
no_sound: False
Your PyTorch installation is not configured to use CUDA. If you have a GPU ready for deep learning, ensure that the drivers are properly installed, and that your CUDA version matches your PyTorch installation. CPU-only inference is currently not supported.
Running a test of your configuration...
Preparing the encoder, the synthesizer and the vocoder...
OMP: Info #250: KMP_AFFINITY: pid 8996 tid 424 thread 1 bound to OS proc set 1
OMP: Info #250: KMP_AFFINITY: pid 8996 tid 1888 thread 2 bound to OS proc set 2
OMP: Info #250: KMP_AFFINITY: pid 8996 tid 9136 thread 3 bound to OS proc set 3
Loaded encoder "pretrained.pt" trained to step 1564501
Found synthesizer "pretrained" trained to step 278000
Building Wave-RNN
Trainable Parameters: 4.481M
Loading model weights at vocoder\saved_models\pretrained\pretrained.pt
Testing your configuration with small inputs.
Testing the encoder...
WARNING:tensorflow:From D:\Projects\Voice Cloning\SV2TTS - shawwn\Real-Time-Voice-Cloning-master (python)\synthesizer\inference.py:57: The name tf.reset_default_graph is deprecated. Please use tf.compat.v1.reset_default_graph instead.
2020-04-13 11:12:21.391252: I tensorflow/core/common_runtime/process_util.cc:115] Creating new thread pool with default inter op setting: 4. Tune using inter_op_parallelism_threads for best performance.
WARNING:tensorflow:From D:\Projects\Voice Cloning\SV2TTS - shawwn\Real-Time-Voice-Cloning-master (python)\synthesizer\tacotron2.py:62: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.
WARNING:tensorflow:From C:\python-programs\Anaconda3\envs\rtvc\lib\site-packages\tensorflow\python\training\saver.py:1276: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
OMP: Info #250: KMP_AFFINITY: pid 8996 tid 1432 thread 4 bound to OS proc set 0
OMP: Info #250: KMP_AFFINITY: pid 8996 tid 756 thread 5 bound to OS proc set 1
OMP: Info #250: KMP_AFFINITY: pid 8996 tid 2012 thread 6 bound to OS proc set 2
OMP: Info #250: KMP_AFFINITY: pid 8996 tid 7152 thread 7 bound to OS proc set 3
OMP: Info #250: KMP_AFFINITY: pid 8996 tid 7340 thread 8 bound to OS proc set 0
OMP: Info #250: KMP_AFFINITY: pid 8996 tid 7184 thread 9 bound to OS proc set 1
OMP: Info #250: KMP_AFFINITY: pid 8996 tid 4888 thread 10 bound to OS proc set 2
OMP: Info #250: KMP_AFFINITY: pid 8996 tid 3912 thread 11 bound to OS proc set 3
OMP: Info #250: KMP_AFFINITY: pid 8996 tid 4496 thread 12 bound to OS proc set 0
OMP: Info #250: KMP_AFFINITY: pid 8996 tid 8696 thread 13 bound to OS proc set 1
OMP: Info #250: KMP_AFFINITY: pid 8996 tid 124 thread 14 bound to OS proc set 2
OMP: Info #250: KMP_AFFINITY: pid 8996 tid 4064 thread 15 bound to OS proc set 3
OMP: Info #250: KMP_AFFINITY: pid 8996 tid 2920 thread 16 bound to OS proc set 0
OMP: Info #250: KMP_AFFINITY: pid 8996 tid 5044 thread 17 bound to OS proc set 1
OMP: Info #250: KMP_AFFINITY: pid 8996 tid 3748 thread 19 bound to OS proc set 3
OMP: Info #250: KMP_AFFINITY: pid 8996 tid 4052 thread 18 bound to OS proc set 2
Testing the vocoder...
All test passed! You can now synthesize speech.
This is a GUI-less example of interface to SV2TTS. The purpose of this script is to show how you can interface this project easily with your own. See the source code for an explanation of what is happening.
Interactive generation loop
Reference voice: enter an audio filepath of a voice to be cloned (mp3, wav, m4a, flac, ...):
When pointing it to my mp3 file, error: Caught exception: NoBackendError()
Appears to be related to librosa (via audioread) missing a decoding backend for compressed formats such as mp3/OGG
test_librosa.py
import librosa
# Load librosa's bundled example clip (an OGG file); raises NoBackendError when no audio backend is available
y, sr = librosa.load(librosa.util.example_audio_file())
No Backend Error for librosa
Traceback (most recent call last):
File "D:/Projects/Voice Cloning/TestLibRosa (python)/test_librosa.py", line 3, in <module>
y, sr = librosa.load(librosa.util.example_audio_file())
File "C:\python-programs\Anaconda3\envs\rtvc\lib\site-packages\librosa\core\audio.py", line 119, in load
with audioread.audio_open(os.path.realpath(path)) as input_file:
File "C:\python-programs\Anaconda3\envs\rtvc\lib\site-packages\audioread\__init__.py", line 116, in audio_open
raise NoBackendError()
audioread.exceptions.NoBackendError
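The notes don't record the fix applied here, but the usual remedy for audioread's NoBackendError is to install a decoding backend such as ffmpeg (command assumed, using the conda-forge channel):

>conda install -c conda-forge ffmpeg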
This is a GUI-less example of interface to SV2TTS. The purpose of this script is to show how you can interface this project easily with your own. See the source code for an explanation of what is happening.
Interactive generation loop
Reference voice: enter an audio filepath of a voice to be cloned (mp3, wav, m4a, flac, ...):
..\..\voicefiles\ADH_sample.mp3
Loaded file succesfully
Caught exception: RuntimeError('[enforce fail at ..\\c10\\core\\CPUAllocator.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 1005568 bytes. Buy new RAM!\n')
Restarting
Memory at the time of the error: 6.1 GB, with no memory disk cache.
Changed to 12.1 GB (still no memory disk cache); 9.4 GB available with PyCharm launched.
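If memory stays tight, another option (not tried here) is the low_mem switch that appears in the demo_cli.py Arguments printout above; the flag name below is assumed from that printout:

>python demo_cli.py --low_mem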
Results (success) demo_cli.py
This is a GUI-less example of interface to SV2TTS. The purpose of this script is to show how you can interface this project easily with your own. See the source code for an explanation of what is happening.
Interactive generation loop
Reference voice: enter an audio filepath of a voice to be cloned (mp3, wav, m4a, flac, ...):
..\..\voicefiles\ADH_sample.mp3
Loaded file succesfully
Created the embedding
Write a sentence (+-20 words) to be synthesized:
Hi there, this is Austin
Created the mel spectrogram
Synthesizing the waveform:
{| ████████████████ 76000/76800 | Batch Size: 8 | Gen Rate: 0.9kHz | }float64
Saved output as demo_output_00.wav
Run: demo_toolbox.py
Results (failure: PyQT5 Error) demo_toolbox.py
Traceback (most recent call last):
File "D:/Projects/Voice Cloning/SV2TTS - shawwn/Real-Time-Voice-Cloning-master (python)/demo_toolbox.py", line 2, in <module>
from toolbox import Toolbox
File "D:\Projects\Voice Cloning\SV2TTS - shawwn\Real-Time-Voice-Cloning-master (python)\toolbox\__init__.py", line 1, in <module>
from toolbox.ui import UI
File "D:\Projects\Voice Cloning\SV2TTS - shawwn\Real-Time-Voice-Cloning-master (python)\toolbox\ui.py", line 1, in <module>
from matplotlib.backends.backend_qt5agg import FigureCanvasQTAgg as FigureCanvas
File "C:\python-programs\Anaconda3\envs\rtvc\lib\site-packages\matplotlib\backends\backend_qt5agg.py", line 11, in <module>
from .backend_qt5 import (
File "C:\python-programs\Anaconda3\envs\rtvc\lib\site-packages\matplotlib\backends\backend_qt5.py", line 15, in <module>
import matplotlib.backends.qt_editor.figureoptions as figureoptions
File "C:\python-programs\Anaconda3\envs\rtvc\lib\site-packages\matplotlib\backends\qt_editor\figureoptions.py", line 12, in <module>
from matplotlib.backends.qt_compat import QtGui
File "C:\python-programs\Anaconda3\envs\rtvc\lib\site-packages\matplotlib\backends\qt_compat.py", line 168, in <module>
raise ImportError("Failed to import any qt binding")
ImportError: Failed to import any qt binding
Process finished with exit code 1
>conda list pyqt*
# Name Version Build Channel
pyqt 5.9.2 py37h6538335_4 conda-forge
Apparently pyqt v5.9.2 is not the same as PyQt5.
>pip install PyQT5
>conda list pyqt*
# Name Version Build Channel
pyqt 5.9.2 py37h6538335_4 conda-forge
pyqt5 5.14.2 pypi_0 pypi
pyqt5-sip 12.7.2 pypi_0 pypi
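With PyQt5 in place, the import chain that failed in the traceback above can be re-tested on its own (a minimal check):

# Re-run the imports that previously raised "Failed to import any qt binding"
from PyQt5.QtWidgets import QApplication
from matplotlib.backends.backend_qt5agg import FigureCanvasQTAgg as FigureCanvas
print('Qt binding OK')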
Run: demo_toolbox.py
Results (success, but missing dataset) demo_toolbox.py
WARNING:tensorflow:From D:\Projects\Voice Cloning\SV2TTS - shawwn\Real-Time-Voice-Cloning-master (python)\synthesizer\models\modules.py:91: The name tf.nn.rnn_cell.RNNCell is deprecated. Please use tf.compat.v1.nn.rnn_cell.RNNCell instead.
Warning: you did not pass a root directory for datasets as argument.
The recognized datasets are:
LibriSpeech/dev-clean
LibriSpeech/dev-other
LibriSpeech/test-clean
LibriSpeech/test-other
LibriSpeech/train-clean-100
LibriSpeech/train-clean-360
LibriSpeech/train-other-500
LibriTTS/dev-clean
LibriTTS/dev-other
LibriTTS/test-clean
LibriTTS/test-other
LibriTTS/train-clean-100
LibriTTS/train-clean-360
LibriTTS/train-other-500
LJSpeech-1.1
VoxCeleb1/wav
VoxCeleb1/test_wav
VoxCeleb2/dev/aac
VoxCeleb2/test/aac
VCTK-Corpus/wav48
Feel free to add your own. You can still use the toolbox by recording samples yourself.
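To use the toolbox with real data, download one or more of the datasets above into a common root folder and pass that root to the script. The exact argument name depends on the repo version (python demo_toolbox.py --help lists it); the -d flag and path below are only an assumed example:

>python demo_toolbox.py -d D:\datasets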