Real-time emotional voices with D.A.V.I.D.

“Like an auto-tune, but for emotions” (Brian Resnick, for Vox.com)


DAVID (Da Amazing Voice Inflection Device) is a free, real-time voice transformation tool able to “colour” any voice recording with an emotion that wasn’t intended by its speaker. DAVID was designed with the affective psychology and neuroscience community in mind, and aims to provide researchers with new ways to produce and control affective stimuli, both for offline listening and for real-time paradigms. For instance, we have used it to create real-time emotional vocal feedback.

Technically, DAVID is implemented as an open-source patch for the closed-source audio processing platform Max (Cycling’74) and, like Max, runs on most Windows and MacOS configurations.

UPDATE: As of March 2017, DAVID is available as a free download on the IRCAM Forum, which provides the latest software releases as well as better user support.

It was developed by Dr Marco Liuni and members of the CREAM Neuroscience Lab at the computer music center IRCAM in Paris (France), with generous funding from the European Research Council (CREAM #335536, 2014-2019, PI: JJ Aucouturier) and in collaboration with Petter Johansson and Lars Hall (Lund University, Sweden), Rodrigo Segnini (Siemens, Japan), Katsumi Watanabe (Waseda University, Japan), and Daniel Richardson (University College London, UK). DAVID was so named after Talking Heads’ frontman David Byrne, whom we were privileged to count among our early users in March 2015.

What does it do?

DAVID is a software tool able to “add” emotion to a speech recording, i.e. it can make that American chap

or that French young lady

sound scared,

happy,

or sad

(and whatever else you’d like them to sound like).

It does so in a way that is properly validated for behavioural science: emotions are recognizable by listeners, they are fully controllable and reproducible in intensity, and they are mistaken for natural expressions rather than detected as synthetic (Rachman et al., 2017). In fact, even the speakers themselves mistake the manipulated speech for their own (Aucouturier et al., 2015).

In addition, it does so in real time, i.e. you can transform speech as it is spoken, e.g. over the phone. With modern audio interfaces, we have reached in/out latencies as small as 15 ms, which makes DAVID usable even as vocal feedback to a speaker without disrupting speech production (Aucouturier et al., 2015).
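As a rough illustration of where such latency figures come from (this is a back-of-the-envelope sketch, not DAVID’s actual internals, which live inside the Max patch), the in/out latency of any block-based audio pipeline is bounded below by the audio buffers queued on the input and output sides; the buffer size and buffer count below are illustrative assumptions:

```python
# Hypothetical latency estimate for a block-based audio pipeline
# (illustration only; DAVID itself runs inside Max/Cycling'74).

def io_latency_ms(block_size: int, sample_rate: int, n_buffers: int = 2) -> float:
    """Latency contributed by one side (in or out) of the audio chain:
    n_buffers blocks of audio are queued before a sample re-emerges."""
    return n_buffers * block_size / sample_rate * 1000.0

# At 44.1 kHz with 64-sample buffers, double-buffered on each side:
total = io_latency_ms(64, 44100) + io_latency_ms(64, 44100)
print(f"{total:.1f} ms")  # ~5.8 ms of pure buffering, before processing time
```

With larger, more conservative buffer settings (e.g. 256-sample blocks) the same arithmetic lands in the 10–25 ms range, which is consistent with the 15 ms figure quoted above.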

Installation

DAVID is implemented as an open-source patch for the closed-source audio processing platform Max (Cycling’74). To use DAVID, you first need to install Max.

1. First step: Installing Max

DAVID has been fully tested with Max7, and also supports Max6. Both are available in MacOS and Windows versions on the seller’s website: https://cycling74.com/downloads. Max is sold under a commercial license by Cycling’74, but free runtime versions suffice to run DAVID.

According to its seller Cycling’74, the system requirements for Max7 are an Intel® Mac with Mac OS X 10.7 (or later), or a PC with Windows 7 (or later); a multicore processor; 2 GB RAM; and a 1024×768 display. If your system departs widely from these specifications, consider installing Max6 (https://cycling74.com/downloads/older/).

2. Second step: Downloading DAVID

DAVID V1.2 was released on January 31st, 2017. As of March 2017, DAVID is available as a free download on the IRCAM Forum, the community for science and art users of audio software developed at IRCAM. Simply follow the download link at http://forumnet.ircam.fr/product/david/, create a (free) IRCAM Forum account (or log in if you already have one), and extract the .zip file on your computer. Finally, to open the patch in Max, double-click the DAVID.maxproj file.


Usage documentation

For documentation, tutorials and user support, please refer to the pages at http://forumnet.ircam.fr/product/david/

Experimental validation

DAVID was extensively validated for use in psychological and neuroscientific experiments. In listening experiments conducted at IRCAM (France), UCL (UK), Lund University (Sweden) and Waseda University (Japan), we found that emotional transformations made with DAVID were well recognized and sounded as natural as non-modified expressions by the same speaker; that the emotional intensity of the transformations could be controlled; and that the transformations were valid across several languages, namely French, English, Swedish and Japanese.

The following video reports on some of the experimental validation presented at the 4th International Conference on Music and Emotion, October 2015, in Geneva (Switzerland).

A complete report on the experimental validation is in press (Rachman et al., 2017) and available as a preprint from http://biorxiv.org/content/early/2016/01/28/038133

Associated Publications

Rachman, L., Liuni, M., Arias, P., Lind, A., Johansson, P., Hall, L., Richardson, D., Watanabe, K., Aucouturier, J.J. (2017) DAVID: An Open-source platform for real-time emotional speech transformation. Behavior Research Methods (in press). Preprint.

Aucouturier, J.J., Johansson, P., Hall, L., Segnini, R., Mercadié, L. & Watanabe, K. (2015) Covert Digital Manipulation of Vocal Emotion Alter Speakers’ Emotional State in a Congruent Direction. Proceedings of the National Academy of Sciences. Open-access

Technical support and questions

DAVID was developed by Dr Marco Liuni (Google Scholar), who may be able to help with technical issues (marco.liuni@ircam.fr). Its experimental validation (recognizability, naturalness, emotional intensity) was conducted by Ms. Laura Rachman (Google Scholar), whom you can ask about DAVID’s applicability to your own research needs (laura.rachman@ircam.fr). The project’s scientific lead was J.J. Aucouturier (Google Scholar), who’ll be happy to chat about anything, really (aucouturier@gmail.com).