WORLD (Difference between TANDEM-STRAIGHT and WORLD)

WORLD was proposed to synthesize high-quality speech that sounds as natural as the input speech. Its purpose is to reduce the computational cost of TANDEM-STRAIGHT without degrading the quality of the synthesized speech. WORLD is better suited than TANDEM-STRAIGHT to real-time singing synthesis, whereas TANDEM-STRAIGHT is better suited to flexible manipulation of consonants. Because the two systems are built on different concepts, choose between them according to your purpose.

Note: The latest version, 0.2.0, is superior to TANDEM-STRAIGHT in every respect.

Speech processing by WORLD

The figure illustrates speech processing by WORLD. WORLD decomposes input speech into three parameters: fundamental frequency (F0), spectral envelope, and aperiodicity. (Note: the excitation signal used in the previous version was removed in version 0.2.0.) These three parameters can be manipulated, and speech can then be generated from them. They are also useful for analysing para-linguistic and non-linguistic information.
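
The idea of manipulating one parameter independently of the others can be sketched in a few lines of Python. This is only an illustration and not WORLD's actual API: the `WorldParameters` container, the `shift_pitch` helper, and the toy values are all hypothetical, and the convention that an F0 of 0.0 marks an unvoiced frame is an assumption. The analysis step that would produce the parameters and the synthesis step that would turn them back into a waveform are not shown.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class WorldParameters:
    """Hypothetical container for the three parameters, one entry per frame."""
    f0: List[float]                       # fundamental frequency in Hz; 0.0 = unvoiced (assumed)
    spectral_envelope: List[List[float]]  # frames x frequency bins
    aperiodicity: List[List[float]]       # frames x frequency bins


def shift_pitch(params: WorldParameters, semitones: float) -> WorldParameters:
    """Return a copy with F0 scaled by the given interval; other parameters untouched."""
    ratio = 2.0 ** (semitones / 12.0)
    # Scale voiced frames only; unvoiced frames stay at 0.0.
    new_f0 = [f * ratio if f > 0.0 else 0.0 for f in params.f0]
    return replace(params, f0=new_f0)


# Toy values for three frames and two frequency bins (made up for illustration).
params = WorldParameters(
    f0=[0.0, 220.0, 440.0],
    spectral_envelope=[[1.0, 0.5], [0.8, 0.4], [0.6, 0.3]],
    aperiodicity=[[1.0, 1.0], [0.1, 0.3], [0.1, 0.2]],
)
shifted = shift_pitch(params, 12.0)  # raise the pitch by one octave
```

Because only F0 is changed, the spectral envelope and aperiodicity of `shifted` are identical to those of `params`. In a real pipeline the modified parameters would then be passed to the synthesis stage.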