‘Mood mining’: researchers propose app to judge your long-term state of mind from your voice
Mon 21 Mar 2016

A group of Australian researchers is investigating possible applications of a technology that gauges a user’s general mood from the sound of their voice, both on its own terms and in relation to the voices of the people they speak with.
In the paper Context-aware Mood Mining [PDF], the researchers emphasise that mood does not equate to emotion: emotion is a transient, short-lived state, and is unlikely to be a meaningful indicator over time in fields such as health monitoring or performance assessment.
The ‘context-aware’ aspect of the proposal involves the program, powered by Deep Neural Networks, considering not just an aggregate mean of the user’s tone of voice over varying periods, measured against a reference index that may not be meaningful in any particular case, but also the user’s tone relative to that of the people they interact with:
‘When the user takes part in a phone conversation the system will use the emotional construct of the speech of the person at the other end, the listener, as the “contextual information”. If the listener is talking about an exciting event the user is expected to be excited or cheerful if he/she is in a positive mood, otherwise, it would be assumed that the user is in a negative mood.’
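To make the heuristic concrete, the following is a minimal Python sketch of the relative-mood logic described in the quote, assuming valence scores in [-1, 1] have already been extracted from each speaker’s audio. The function name, threshold, and decision logic are illustrative assumptions, not the paper’s actual model, which would derive these scores from a DNN over acoustic features.

```python
# A minimal sketch of the contextual heuristic quoted above. The
# infer_mood() logic and threshold are illustrative assumptions, not
# the paper's actual model: a real system would derive valence scores
# from a DNN over acoustic features rather than take them as inputs.

def infer_mood(user_valence: float, listener_valence: float,
               threshold: float = 0.2) -> str:
    """Judge the user's mood from their vocal valence *relative to*
    the listener's, rather than against a fixed population norm.

    Both inputs are assumed to lie in [-1.0, 1.0], where positive
    values indicate positive/excited speech.
    """
    if listener_valence > threshold:
        # Listener sounds excited: a positive user response suggests
        # a positive mood; a flat or negative one suggests otherwise.
        return "positive" if user_valence > threshold else "negative"
    # With a neutral or negative listener, fall back to the user's
    # own valence as the (weaker) signal.
    return "positive" if user_valence > 0 else "negative"


if __name__ == "__main__":
    # An excited caller met with a flat response reads as negative mood.
    print(infer_mood(user_valence=0.05, listener_valence=0.7))   # negative
    # The same flat response to a neutral caller reads differently.
    print(infer_mood(user_valence=0.05, listener_valence=0.0))   # positive
```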
Potential users might be alarmed to consider that the over-caffeinated exuberance of their more excitable friends could end up being treated as a reference point against which their own vocal tone falls short. The paper does not address how to overcome anomalies of this nature, though presumably variables such as age and anomalous correspondents would be taken into account.
Technologies that can gauge long-term states of mind with any measure of accuracy have obvious applications in the field of mental health. Prior work [PDF] in this field, led by Cambridge researcher Petko Georgiev, concentrates on isolating speech audio from ambient noise, such as car noise, and the Australian paper develops this theme further.
Apps with this kind of ambit inevitably need to run on mobile devices with limited resources, and Georgiev’s proposal relies on a low-power co-processor to carry the audio workload. Frugality with resources will be essential in audio analysis apps, which must sample sound at a rate adequate to make background noise removal a realistic proposition.
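To give a sense of the trade-off, here is a minimal sketch of a frugal front-end: sample at a modest rate, then discard low-energy frames before any expensive inference runs. The 8 kHz rate, frame size, and energy threshold are illustrative assumptions, not figures drawn from Georgiev’s paper.

```python
# A rough sketch of resource frugality in an audio pipeline: sample at
# a modest rate, then drop low-energy frames before any heavier DNN
# stage. All the constants below are illustrative assumptions.

import numpy as np

SAMPLE_RATE = 8_000        # telephone-quality audio keeps compute low
FRAME_SIZE = 400           # 50 ms frames at 8 kHz
ENERGY_THRESHOLD = 1e-3    # below this, treat the frame as background

def voiced_frames(signal: np.ndarray) -> list[np.ndarray]:
    """Return only the frames energetic enough to plausibly contain
    speech, so later analysis stages run on far less data."""
    frames = [signal[i:i + FRAME_SIZE]
              for i in range(0, len(signal) - FRAME_SIZE + 1, FRAME_SIZE)]
    return [f for f in frames if np.mean(f ** 2) > ENERGY_THRESHOLD]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # One second of quiet noise with a louder 'speech' burst in the middle.
    audio = 0.01 * rng.standard_normal(SAMPLE_RATE)
    audio[3_000:5_000] += 0.5 * np.sin(np.linspace(0, 300 * np.pi, 2_000))
    kept = voiced_frames(audio)
    print(f"kept {len(kept)} of {len(audio) // FRAME_SIZE} frames")
```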
Another paper released this week addresses a secondary issue relevant to using Deep Neural Networks in the mobile space: Compression of Deep Neural Networks on the Fly [PDF], by three researchers from Telecom Bretagne, proposes a new kind of ‘compile time’ compression method that could make DNNs more viable in the low-powered mobile device arena.
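For readers unfamiliar with network compression, the sketch below shows a generic uniform 8-bit weight quantisation scheme in Python. It is not the Telecom Bretagne method, which proposes its own ‘compile time’ approach; it only indicates the scale of memory saving that makes DNNs more plausible on low-powered devices.

```python
# A generic illustration of DNN weight compression via uniform 8-bit
# quantisation. This is *not* the method from the paper cited above;
# it simply shows the memory/accuracy trade-off such schemes exploit.

import numpy as np

def quantise(weights: np.ndarray) -> tuple[np.ndarray, float, float]:
    """Map float32 weights onto 256 uint8 levels, keeping the scale
    and offset needed to approximately reconstruct them."""
    lo, hi = float(weights.min()), float(weights.max())
    scale = (hi - lo) / 255.0
    codes = np.round((weights - lo) / scale).astype(np.uint8)
    return codes, scale, lo

def dequantise(codes: np.ndarray, scale: float, lo: float) -> np.ndarray:
    """Approximately invert quantise()."""
    return codes.astype(np.float32) * scale + lo

if __name__ == "__main__":
    layer = np.random.default_rng(1).standard_normal((512, 512)).astype(np.float32)
    codes, scale, lo = quantise(layer)
    approx = dequantise(codes, scale, lo)
    # float32 -> uint8 storage is a 4x reduction per weight.
    print(f"size: {layer.nbytes} -> {codes.nbytes} bytes")
    print(f"max reconstruction error: {np.abs(layer - approx).max():.4f}")
```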