IRCAM Neutral-Content Voices

A corpus of multilingual neutral-content sentences spoken in various prosodic tones (6 sentences, 4 languages, 3 male and 3 female speakers, 4 emotional tones), available under a Creative Commons licence for research purposes.



Six of the neutral sentences from [Russ, J. B., Gur, R. C., & Bilker, W. B. (2008). Validation of affective and neutral sentence content for prosodic testing. Behavior Research Methods, 40(4), 935–939] were translated into three languages (French, Swedish, Japanese) in addition to the original English.

The different sentences are:

  1. I’m on my way to the meeting (FR: Je suis en route pour la réunion; SE: Jag är på väg till mötet; JP: これから会議なんです)
  2. Can you hear me? (FR: Est-ce que tu m’entends?; JP: 聴こえますか?)
  3. The airplane is almost full (FR: L’avion est presque plein; JP: この便はほとんど満席です)
  4. I would like a new alarm clock (FR: J’aimerais avoir un nouveau réveil; SE: Jag skulle vilja ha en ny väckarklocka; JP: 新しい目覚まし時計が欲しいです)
  5. We’ll stop in a couple of minutes (FR: Nous nous arrêterons dans quelques instants; SE: Vi kommer att stanna om några minuter; JP: もうちょっとしたら休憩しましょう)
  6. Don’t forget a jacket (FR: N’oublie pas de prendre une veste; SE: Glöm inte att ta med en jacka; JP: 上着を忘れないようにしてください)

(Note that only four sentences, numbers 1, 4, 5 and 6, were recorded in Swedish.)


Six male and six female native speakers (non-actors, mean age 22 years) were recorded in each language. The English speakers spoke American English.


All recordings were made in an IAC audiometric booth, using an AKG Perception 420 condenser microphone and a pop filter. Files were bounced at 24-bit depth and 44,100 Hz in .aiff format.
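To check that a downloaded file matches the stated format, the AIFF header can be read directly. The following is a minimal sketch that relies only on the public AIFF container layout (a FORM header followed by chunks, with the format stored in the COMM chunk). The function names `read_aiff_format` and `matches_corpus_format` are illustrative and not part of the corpus distribution.

```python
import struct

EXPECTED_BITS = 24        # bit depth stated in the corpus description
EXPECTED_RATE = 44100.0   # sample rate in Hz stated in the corpus description


def _extended_to_float(raw: bytes) -> float:
    """Decode the 80-bit extended float that AIFF uses for the sample rate."""
    sign_exp, mantissa = struct.unpack(">HQ", raw)
    sign = -1.0 if sign_exp & 0x8000 else 1.0
    exponent = sign_exp & 0x7FFF
    if exponent == 0 and mantissa == 0:
        return 0.0
    return sign * mantissa * 2.0 ** (exponent - 16383 - 63)


def read_aiff_format(stream):
    """Return (channels, bits_per_sample, sample_rate) from a binary AIFF stream."""
    form_id, _size, form_type = struct.unpack(">4sI4s", stream.read(12))
    if form_id != b"FORM" or form_type not in (b"AIFF", b"AIFC"):
        raise ValueError("not an AIFF file")
    while True:
        header = stream.read(8)
        if len(header) < 8:
            raise ValueError("COMM chunk not found")
        chunk_id, chunk_size = struct.unpack(">4sI", header)
        if chunk_id == b"COMM":
            data = stream.read(chunk_size)
            channels, _frames, bits = struct.unpack(">hIh", data[:8])
            return channels, bits, _extended_to_float(data[8:18])
        # Skip other chunks; chunk data is padded to an even number of bytes.
        stream.seek(chunk_size + (chunk_size & 1), 1)


def matches_corpus_format(stream) -> bool:
    """True if the stream is 24-bit, 44,100 Hz, as the description states."""
    _channels, bits, rate = read_aiff_format(stream)
    return bits == EXPECTED_BITS and rate == EXPECTED_RATE
```

A file would be checked by opening it in binary mode, for example `with open(path, "rb") as f: print(read_aiff_format(f))`, where `path` is wherever the archive was unpacked.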


Each sentence was recorded by each speaker in 6 prosodic variants: three neutral, one sad, one happy and one afraid. These expressions were not validated experimentally, i.e. we provide no guarantee that the neutral variants are actually emotionally neutral, nor that the emotional variants would be consensually recognized as any of the intended emotions by third-party listeners. They were collected only as a way to gather prosodic variety and a sample of natural-sounding emotions. The only thing that is validated experimentally here is the neutral nature of the semantics of the original English sentences, per Russ, Gur & Bilker (2008).
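The design described above implies a fixed number of recordings per speaker, which is useful as a sanity check after download. The sketch below works that arithmetic out, assuming every speaker recorded every available sentence in all six variants. The number of speakers per language is left as a parameter, and all names are illustrative.

```python
# Sentences recorded per language (Swedish has only sentences 1, 4, 5 and 6).
SENTENCES_PER_LANGUAGE = {"English": 6, "French": 6, "Japanese": 6, "Swedish": 4}

# Six prosodic variants per sentence: three neutral takes plus three emotions.
VARIANTS = ["neutral", "neutral", "neutral", "sad", "happy", "afraid"]


def recordings_per_speaker(language: str) -> int:
    """Expected recordings for one speaker of the given language."""
    return SENTENCES_PER_LANGUAGE[language] * len(VARIANTS)


def recordings_per_language(language: str, n_speakers: int) -> int:
    """Expected recordings for a language, given an assumed speaker count."""
    return recordings_per_speaker(language) * n_speakers


if __name__ == "__main__":
    for language in SENTENCES_PER_LANGUAGE:
        print(language, recordings_per_speaker(language))
```

Under these assumptions, each English, French or Japanese speaker contributes 6 × 6 = 36 recordings, and each Swedish speaker contributes 4 × 6 = 24.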


The corpus is available as a free download, under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 licence, as a .zip or torrent file from


How to cite

Arias, Pablo and Rachman, Laura and Lind, Andreas and Aucouturier, JJ. (2015). IRCAM Neutral-content Voices (audio corpus). Available online: