Personalized Generated Singing Samples


Target speech : Speech from unseen target speakers, provided at run-time

Template singing : Template singing recorded from trained singers

SVG-PL : Conversion-vocoding pipeline with a BLSTM conversion model and a WaveRNN vocoder

SVG-WORLD : A variant of the conversion-vocoding pipeline with a BLSTM conversion model and the WORLD vocoder

SVG-IN : Proposed singing voice generation network that integrates conversion and vocoding, using WaveRNN


Female samples: Proposed SVG-IN vs. SVG-PL

Target speech | Template singing | SVG-PL | SVG-IN
Sample 1
Sample 2

Male samples: Proposed SVG-IN vs. SVG-PL

Target speech | Template singing | SVG-PL | SVG-IN
Sample 1
Sample 2

Female samples: Proposed SVG-IN vs. SVG-WORLD

Target speech | Template singing | SVG-WORLD | SVG-IN
Sample 1
Sample 2

Male samples: Proposed SVG-IN vs. SVG-WORLD

Target speech | Template singing | SVG-WORLD | SVG-IN
Sample 1
Sample 2