C.L.E.E.S.E. (Combinatorial Expressive Speech Engine) is a tool designed to generate an arbitrarily large number of natural-sounding, expressive variations around an original speech recording. More precisely, C.L.E.E.S.E. creates random fluctuations around the file’s original contours of pitch, loudness, timbre and speed (roughly speaking, its prosody). One of its foreseen applications is the generation of large sets of random voice stimuli for reverse correlation experiments, or whatever else you fancy, really.
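To give a feel for what "random fluctuations around the original contour" could mean for pitch, here is a minimal sketch in Python. It is not the toolbox's code (which is in MATLAB and not yet released): the function names, the Gaussian distribution of the shifts, the equally spaced breakpoints and the linear interpolation between them are all assumptions made for illustration.

```python
import random

def random_pitch_contour(duration_s, n_segments=6, sd_cents=100.0, seed=None):
    """Draw a random pitch-shift contour as (time_s, shift_cents) breakpoints.

    Hypothetical helper: breakpoints are equally spaced and each shift is
    drawn from a zero-mean Gaussian (both are assumptions).
    """
    rng = random.Random(seed)
    times = [duration_s * i / n_segments for i in range(n_segments + 1)]
    return [(t, rng.gauss(0.0, sd_cents)) for t in times]

def shift_at(breakpoints, t):
    """Linearly interpolate the pitch shift (in cents) at time t."""
    if t <= breakpoints[0][0]:
        return breakpoints[0][1]
    for (t0, s0), (t1, s1) in zip(breakpoints, breakpoints[1:]):
        if t0 <= t <= t1:
            w = (t - t0) / (t1 - t0)
            return s0 + w * (s1 - s0)
    return breakpoints[-1][1]

def apply_shift(f0_hz, cents):
    """Shift a fundamental frequency by a number of cents (1200 cents = 1 octave)."""
    return f0_hz * 2.0 ** (cents / 1200.0)

# A set of 100 random contours for a 1.5 s recording, one per stimulus.
stimuli = [random_pitch_contour(1.5, seed=k) for k in range(100)]
```

Each contour would then be applied to the recording's original pitch track, frame by frame, so that every stimulus stays centred on the original prosody. Seeding each contour makes the stimulus set reproducible, which is convenient when the contours are later correlated with listeners' responses.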
C.L.E.E.S.E. was implemented by Juan José Burred and Emmanuel Ponsot (CREAM Lab, IRCAM, Paris), with generous funding from the European Research Council (CREAM #335536, 2014-2019, PI: JJ Aucouturier).
Random pitch variations around the same recording (French sentence: “Je suis en route pour la réunion” – I’m on my way to the meeting).
Same recording, with random speed variations around the original speed contour.
Same recording, with random timbre variations (i.e. frequency warping of the spectral envelope).
All this is obviously language-independent. “We’ll stop in a couple of minutes”, in Japanese, with random pitch:
C.L.E.E.S.E. is implemented as a free, open-source MATLAB toolbox and will be released soon. Drop us an email if you want more information (JJ Aucouturier, email@example.com).