Symposium: the evolution of vocal and facial expressions

On the occasion of the PhD defense of Pablo Arias on Dec. 18th, the CREAM lab is happy to organize a mini-symposium presenting recent results on the evolution and universality of vocal and facial expressions, with two prominent researchers from the field, Dr. Rachael Jack (School of Psychology, University of Glasgow) and Prof. Tecumseh Fitch (Department of Cognitive Biology, University of Vienna). The two talks will be followed in the afternoon by the PhD viva of Pablo Arias on “auditory smiles”, which is also public.

Date: Tuesday December 18th

Hours: 10h30-12h (symposium), 14h (PhD viva)

Place: Salle Stravinsky, Institut de Recherche et Coordination en Acoustique/Musique (IRCAM), 1 Place Stravinsky, 75004 Paris.

Tuesday Dec. 18th, 10h30-12h

Symposium: The evolution of facial and vocal expressions (Dr. Rachael Jack, Prof. Tecumseh Fitch)

10h30-11h15 – Dr. Rachael Jack (University of Glasgow, UK)

Modelling Dynamic Facial Expressions Across Cultures

Facial expressions are one of the most powerful tools for human social communication. However, understanding how facial expressions communicate is challenging because of their sheer number and complexity. Here, I present a program of work designed to address this challenge using a combination of social and cultural psychology, vision science, data-driven psychophysical methods, mathematical psychology, and 3D dynamic computer graphics. Across several studies, I will present work that precisely characterizes how facial expressions of emotion are signaled and decoded within and across cultures, and shows that cross-cultural emotion communication comprises four, not six, main categories. I will also highlight how this work has the potential to inform the design of socially and culturally intelligent robots.

11h15-12h – Prof. Tecumseh Fitch (University of Vienna, Austria)

The evolution of voice formant perception

Abstract t.b.a.

Tuesday Dec. 18th, 14h-16h30

PhD Defense: Auditory smiles (Pablo Arias)

At 14h on the same day, Pablo Arias (PhD candidate, Sorbonne Université) will defend his PhD thesis, conducted in the CREAM Lab / Perception and Sound Design Team (STMS – IRCAM/CNRS/Sorbonne Université). The viva is public, and all are welcome.

14h-16h30 – Mr. Pablo Arias (IRCAM, CNRS, Sorbonne Université)

The cognition of auditory smiles: a computational approach

Emotions are the fuel of human survival and social development. Not only do we undergo primitive reflexes mediated by ancient brain structures, but we also consciously and unconsciously regulate our emotions in social contexts, affiliating with friends and distancing from foes. One of our main tools for emotion regulation is facial expression and, in particular, smiles. Smiles are deeply grounded in human behavior: they develop early, and are used across cultures to communicate affective states. The mechanisms that underlie their cognitive processing include interactions not only with visual, but also with emotional and motor systems. Smiles trigger facial imitation in their observers, reactions thought to be a key component of the human capacity for empathy. Smiles, however, are not only experienced visually, but also have audible consequences. Although visual smiles have been widely studied, almost nothing is known about the cognitive processing of their auditory counterpart.

Filling this gap is the aim of this dissertation. In this work, we characterise and model the acoustic fingerprint of smiles, and use it to probe how auditory smiles are processed cognitively. We provide evidence that (1) auditory smiles can trigger unconscious facial imitation, that (2) they are cognitively integrated with their visual counterparts during perception, and that (3) the development of these processes does not depend on pre-learned visual associations. We conclude that the embodied mechanisms associated with the visual processing of facial expressions of emotions are in fact equally found in the auditory modality, and that their cognitive development is at least partially independent from visual experience.

Download link: Thesis manuscript

Thesis Committee:

  • Prof. Tecumseh Fitch – Reviewer – Department of Cognitive Biology, University of Vienna
  • Dr. Rachael Jack – Reviewer – School of Psychology, University of Glasgow
  • Prof. Julie Grèzes – Examiner – Département d’Etudes Cognitives, Ecole Normale Supérieure, Paris
  • Prof. Catherine Pelachaud – Examiner – Institut des Systèmes Intelligents et de Robotique, Sorbonne Université/CNRS, Paris.
  • Prof. Martine Gavaret – Examiner – Service de Neurophysiologie, Groupement Hospitalier Saint-Anne, Paris.
  • Dr. Patrick Susini – Thesis Director – STMS, IRCAM/CNRS/Sorbonne Université, Paris
  • Dr. Pascal Belin – Thesis Co-director – Institut des Neurosciences de la Timone, Aix-Marseille Université.
  • Dr. Jean-Julien Aucouturier – Thesis Co-director – STMS, Ircam/CNRS/Sorbonne Université, Paris


Symposium: Recent voice research from the Netherlands

On the occasion of the PhD defense of Laura Rachman on Dec. 7th, the CREAM lab is happy to organize a mini-symposium on recent affective science and neuroscience of the voice, with two prominent researchers from the Netherlands, Prof. Disa Sauter (Department of Social Psychology, Universiteit van Amsterdam) and Prof. Sonja Kotz (Department of Neuropsychology & Psychopharmacology, Maastricht University). The two talks will be followed in the afternoon by the PhD viva of Laura Rachman, which is also public.

Date: Friday December 7th

Hours: 10h30-12h (symposium), 14h (PhD viva)

Place: Salle Stravinsky, Institut de Recherche et Coordination en Acoustique/Musique (IRCAM), 1 Place Stravinsky, 75004 Paris.


Friday Dec. 7th, 10h30-12h

Symposium: Recent voice research from the Netherlands (Prof. Disa Sauter, Prof. Sonja Kotz)


10h30-11h15 – Prof. Disa Sauter (Universiteit van Amsterdam, NL)

Preparedness for emotions: Evidence for discrete negative and positive emotions from vocal signals

We all have emotions, but where do they come from? Functional accounts of emotion propose that emotions are adaptations which have evolved to help us deal with recurring challenges and opportunities. In this talk, I will present evidence of preparedness from studies of emotional vocalisations like laughs, screams, and sighs. This work suggests that a number of negative and positive emotional states are associated with discrete, innate, and universal vocal signals.


11h15-12h – Prof. Sonja Kotz (Maastricht University, NL)

Prediction in voice and speech

Prediction in voice and speech processing is determined by “when” an event is likely to occur (regularity), and “what” type of event can be expected at a given point in time (order). In line with these assumptions, I will present a cortico-subcortical model that involves the division of labor between the cerebellum and the basal ganglia in the predictive tracing of acoustic events. I will discuss recent human electrophysiological and fMRI data in line with this model.


Friday Dec. 7th, 14h-16h30

PhD Defense: The “other-voice” effect (Laura Rachman)

At 14h on the same day, Laura Rachman (PhD candidate, Sorbonne Université) will defend her PhD thesis, conducted in the CREAM Lab / Perception and Sound Design Team (STMS – IRCAM/CNRS/Sorbonne Université). The viva is public, and all are welcome.


14h-16h30 – Ms. Laura Rachman (IRCAM, CNRS, Sorbonne Université)

The “other-voice” effect: how speaker identity and language familiarity influence the way we process emotional speech

The human voice is a powerful tool to convey emotions. Humans hear voices on a daily basis and are able to rapidly extract relevant information to successfully interact with others. The theoretical aim of this dissertation is to investigate the role of familiarity in emotional voice processing. A set of behavioral and electrophysiological studies investigated how self- versus non-self-produced voices influence the processing of emotional speech utterances. By contrasting self and other, familiarity is here assessed at a personal level. The results of a first set of studies show a dissociation of explicit and implicit processing of the self-voice. While explicit discrimination of an emotional self-voice and other-voice was somewhat impaired, implicit self-processing prompted a self-advantage in emotion recognition and speaker discrimination. The results of a second set of studies show a prioritization of the non-self voice in the processing of emotional and low-level acoustic changes, reflected in faster electrophysiological (EEG) and behavioral responses. In a third set of studies, the effect of voice familiarity on emotional voice perception is assessed at a larger sociocultural scale by comparing speech utterances in the native and a foreign language. Taken together, this dissertation highlights some ways in which the ‘otherness’ of a voice – whether a non-self speaker or a foreign language speaker – is processed with a higher priority on the one hand, but with less acoustic precision on the other hand.

Download link: Thesis manuscript

Thesis Committee:

  • Prof. Sonja Kotz – Reviewer – Department of Neuropsychology and Psychopharmacology, Maastricht University
  • Prof. Pascal Belin – Reviewer – Institut de Neurosciences de la Timone, CNRS, Aix-Marseille Université
  • Prof. Disa Sauter – Examiner – Department of Social Psychology, Universiteit van Amsterdam
  • Dr. Marie Gomot – Examiner – Centre de Pédopsychiatrie, INSERM, Université de Tours
  • Prof. Mohamed Chetouani – Examiner – Institut des Systèmes Intelligents et de Robotique, Sorbonne Université
  • Dr. Stéphanie Dubal – Thesis Co-director – Institut du Cerveau et de la Moelle épinière, CNRS, Sorbonne Université
  • Dr. Jean-Julien Aucouturier – Thesis Co-director – STMS – Ircam/CNRS/Sorbonne Université


ANGUS: the Highway to Yell

ANGUS is a real-time voice transformation tool able to simulate cues of arousal/roughness on arbitrary voice signals with a high degree of realism. Vocal roughness is generated by highly unstable modes of vibration in the vocal folds and tract, which produce sub-harmonics and nonlinear components that are absent from standard phonation. We propose to simulate this physiological mechanism using multiple amplitude modulations driven by the fundamental frequency of the incoming sound.
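
To give a feel for why amplitude modulation tied to the fundamental frequency produces sub-harmonics, here is a minimal offline sketch in plain Python. It is not the ANGUS implementation: the function names, the `depth` and `divisors` parameters, and the assumption that f0 is constant and already known are all ours, chosen for illustration (a real-time tool would have to track f0 continuously). Modulating a 200 Hz tone at f0/2 creates new components at 100 Hz and 300 Hz that were not in the input.

```python
import math


def add_subharmonics(samples, f0_hz, sample_rate, depth=0.5, divisors=(2,)):
    """Amplitude-modulate `samples` at integer fractions of f0 (illustrative only)."""
    peak = 1.0 + depth * len(divisors)  # largest value the modulator can reach
    out = []
    for n, x in enumerate(samples):
        t = n / sample_rate
        modulator = 1.0 + sum(
            depth * math.cos(2.0 * math.pi * (f0_hz / d) * t) for d in divisors
        )
        out.append(x * modulator / peak)  # rescale so output never exceeds input peak
    return out


def magnitude_at(samples, freq_hz, sample_rate):
    """Normalised DFT magnitude of `samples` at a single frequency."""
    re = sum(x * math.cos(2.0 * math.pi * freq_hz * n / sample_rate)
             for n, x in enumerate(samples))
    im = sum(x * math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
             for n, x in enumerate(samples))
    return math.hypot(re, im) / len(samples)


sample_rate = 8000
f0 = 200.0  # assumption: constant, known fundamental frequency
voice = [math.sin(2.0 * math.pi * f0 * n / sample_rate) for n in range(sample_rate)]
rough = add_subharmonics(voice, f0, sample_rate)

print(magnitude_at(voice, f0 / 2, sample_rate))  # close to zero: no sub-harmonic
print(magnitude_at(rough, f0 / 2, sample_rate))  # clearly non-zero: sub-harmonic at f0/2
```

Passing several divisors, for example `divisors=(2, 3)`, stacks several modulations, which is one simple way to read "multiple amplitude modulations" in the description above.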



Cracking the social code of speech prosody

New paper out this month in PNAS, in which we use new audio software (CLEESE) to deploy reverse-correlation in the space of speech prosody, and uncover robust and shared mental representations of trustworthiness and dominance in a speaker’s voice. The paper is open-access, the data and analysis code are freely available, and the CLEESE software is open-source and available as a free download.

Ponsot, E., Burred, JJ., Belin, P. & Aucouturier, JJ. (2018) Cracking the social code of speech prosody using reverse correlation, Proceedings of the National Academy of Sciences. [html] [pdf]
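
For readers new to reverse correlation, the toy simulation below sketches the general idea in plain Python; it is not the analysis code of the paper. A simulated listener holds a hidden "template" contour and, on each trial, picks whichever of two random contours matches it better. Averaging the chosen contours and subtracting the average of the rejected ones gives a first-order kernel that recovers the shape of the template. The template values, the number of trials, and the noiseless decision rule are all assumptions made for this illustration.

```python
import random


def classification_kernel(chosen, rejected):
    """Mean of the chosen stimuli minus mean of the rejected ones, point by point."""
    n_points = len(chosen[0])
    kernel = []
    for j in range(n_points):
        mean_chosen = sum(stim[j] for stim in chosen) / len(chosen)
        mean_rejected = sum(stim[j] for stim in rejected) / len(rejected)
        kernel.append(mean_chosen - mean_rejected)
    return kernel


def match(template, stimulus):
    """How well a stimulus matches the listener's internal template (dot product)."""
    return sum(w * x for w, x in zip(template, stimulus))


rng = random.Random(1)
template = [-1.0, 0.0, 1.0, 0.0]  # hypothetical hidden representation (e.g. a rising pitch)

chosen, rejected = [], []
for _ in range(500):  # 500 simulated two-alternative trials
    a = [rng.gauss(0.0, 1.0) for _ in template]
    b = [rng.gauss(0.0, 1.0) for _ in template]
    if match(template, a) >= match(template, b):
        chosen.append(a)
        rejected.append(b)
    else:
        chosen.append(b)
        rejected.append(a)

kernel = classification_kernel(chosen, rejected)
print([round(k, 2) for k in kernel])  # negative at point 0, positive at point 2
```

A real listener is noisy, so the recovered kernel is a blurrier estimate than in this idealised simulation, and many more trials are typically needed.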









(S)CREAM ! An impromptu workshop on screams

The CREAM lab is organizing a short, impromptu workshop on the biology, cultural history, musicality and acoustics of !screams!, to be held at IRCAM, Paris, on Thursday 22nd June, 2-5pm. The workshop will consist of four invited talks, followed by a discussion around drinks and cakes.




Date: Thursday 22nd June 2017, 2-5pm

Place: Stravinsky Room, IRCAM, 1 Place Stravinsky, 75004 Paris.

Attendance: free, subject to seat availability.

Local organizers: Louise Goupil, JJ Aucouturier





Ministry of Silly Talks: Infinite numbers of prosodic variations with C.L.E.E.S.E.

C.L.E.E.S.E. (Combinatorial Expressive Speech Engine) is a tool designed to generate an infinite number of natural-sounding, expressive variations around an original speech recording. More precisely, C.L.E.E.S.E. creates random fluctuations around the file’s original contour of pitch, loudness, timbre and speed (roughly speaking, its prosody). One of its applications is the generation of very many random voice stimuli for reverse correlation experiments, or whatever else you fancy, really.
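
As a rough illustration of what "random fluctuations around the original contour" can mean for pitch, the sketch below perturbs a pitch contour with a random, piecewise-linear shift expressed in cents. It only manipulates a list of f0 values and does not touch audio, and it does not use or reproduce the C.L.E.E.S.E. API: the function name, `n_segments`, and `sd_cents` are hypothetical choices made for this example.

```python
import random


def random_pitch_variant(f0_contour_hz, n_segments=6, sd_cents=70.0, rng=None):
    """Return a randomly perturbed copy of a pitch contour, plus the shifts used.

    Gaussian pitch shifts (in cents) are drawn at n_segments + 1 evenly spaced
    breakpoints and linearly interpolated between breakpoints.
    """
    rng = rng or random.Random()
    shifts = [rng.gauss(0.0, sd_cents) for _ in range(n_segments + 1)]
    last = len(f0_contour_hz) - 1
    variant = []
    for i, f0 in enumerate(f0_contour_hz):
        pos = (i / last) * n_segments if last > 0 else 0.0  # position in breakpoint units
        left = min(int(pos), n_segments - 1)                # breakpoint to the left
        frac = pos - left                                   # 0 at left, 1 at right
        cents = (1.0 - frac) * shifts[left] + frac * shifts[left + 1]
        variant.append(f0 * 2.0 ** (cents / 1200.0))        # 1200 cents = one octave
    return variant, shifts


original = [120.0] * 50  # assumption: a flat 120 Hz contour, 50 analysis frames
variant, shifts = random_pitch_variant(original, rng=random.Random(0))
print([round(f, 1) for f in variant[:5]])
```

Calling the function repeatedly with different random states yields as many distinct variants of the same contour as needed, which is the property that makes this kind of generator useful for reverse correlation experiments.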



Voice transformation tool DAVID now available on the IRCAM Forum!

Exciting news! As of March 2017, DAVID, our emotional voice transformation tool, is available as a free download on the IRCAM Forum, the online community of all science and art users of audio software developed at IRCAM. This new platform will provide updates on the latest releases of the software, and better user support. In addition, we’ll demonstrate the software at the IRCAM Forum days in Paris on March 15-17, 2017. If you’re around, come say hi (and sound very realistically happy/sad/afraid)!



Upcoming: Two invited talks on reverse-correlation for high-level auditory cognition

CREAM Lab is hosting a small series of distinguished talks on reverse-correlation this month:

  • Wednesday 22nd March 2017 (11:00) – Prof. Frédéric Gosselin (University of Montreal)
  • Thursday 23rd March 2017 (11:30) – Prof. Peter Neri (Ecole Normale Supérieure, Paris).

These talks are organised in the context of a workshop on reverse-correlation for high-level audio cognition, to be held at IRCAM on the same days (by invitation only). Both talks are free and open to all, at IRCAM (1 Place Stravinsky, 75004 Paris). Details (titles, abstracts) are below.

