Personalized Generated Singing Samples
Target speech: Speech from unseen target speakers, provided at run-time
Template singing: Singing recorded from trained singers
SVG-PL: Conversion-vocoding pipeline with a BLSTM conversion model and a WaveRNN vocoder
SVG-WORLD: A variant of the conversion-vocoding pipeline with a BLSTM conversion model and the WORLD vocoder
SVG-IN: Proposed singing voice generation network that integrates conversion and vocoding, using WaveRNN
Female samples: Proposed SVG-IN vs. SVG-PL
| | Target speech | Template singing | SVG-PL | SVG-IN |
| Sample 1 | [audio] | [audio] | [audio] | [audio] |
| Sample 2 | [audio] | [audio] | [audio] | [audio] |
Male samples: Proposed SVG-IN vs. SVG-PL
| | Target speech | Template singing | SVG-PL | SVG-IN |
| Sample 1 | [audio] | [audio] | [audio] | [audio] |
| Sample 2 | [audio] | [audio] | [audio] | [audio] |
Female samples: Proposed SVG-IN vs. SVG-WORLD
| | Target speech | Template singing | SVG-WORLD | SVG-IN |
| Sample 1 | [audio] | [audio] | [audio] | [audio] |
| Sample 2 | [audio] | [audio] | [audio] | [audio] |
Male samples: Proposed SVG-IN vs. SVG-WORLD
| | Target speech | Template singing | SVG-WORLD | SVG-IN |
| Sample 1 | [audio] | [audio] | [audio] | [audio] |
| Sample 2 | [audio] | [audio] | [audio] | [audio] |