Microsoft’s Chinese social experiment: the largest Turing test in history
Fri 5 Feb 2016
Xiaoice, an artificially intelligent chatbot from Microsoft, has become one of the leading celebrities on Chinese social media and an extraordinarily successful example of a Turing test, in which an AI seeks to ‘pass’ as human in interactions. In China, users have taken a remarkably long time to realize that they are communicating with a program rather than a human: upwards of ten minutes in many cases. Xiaoice’s popularity with users implies that once they know, they don’t care.
The program, created to emulate the personality of a 17-year-old girl, has more than 40 million registered users, and approximately 25% of them (ten million people) have said “I love you” to her, seemingly without irony. Built on the joint expertise of software developers and psychological experts to balance IQ (intelligence) and EQ (emotional intelligence), Xiaoice differs from standard AI chatbots in significant ways. She disagrees with users, offers independent opinions and conclusions, works to demonstrate caring, and is unpredictable, not always offering the same response to the same input.
Users have traditionally come to expect machines to focus on task completion, but Xiaoice focuses on the conversation itself. She memorizes and tracks users’ emotional states, and even offers a 33-day breakup therapy course for people having relationship problems. In December, Xiaoice also got a job in media, becoming a trainee anchor and reporting the weather on live TV on the Chinese program ‘Morning News’.
Xiaoice, whose name translates to ‘little Bing’, is essentially a big data project built on Microsoft’s Bing search engine. She is a more personable, more realistic, and more engaging version of Siri or Cortana. Yongdong Wang, MD of Applications and Services East Asia for Microsoft, points to a new measure of chatbot effectiveness: “Conversations per Session”, or CPS, which records the average number of turns in a conversation and indicates how well a chatbot emulates organic human conversation. The average for an AI personal assistant, for example, is between 1.5 and 2.5 CPS; Xiaoice’s average over tens of millions of conversations is 23.
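For readers who want to see the arithmetic, CPS as described here is simply a mean of turn counts across sessions. The short Python sketch below is an illustration only, not Microsoft’s implementation: the function name `conversations_per_session`, the sample data, and the choice to count every message in a session as one turn are all assumptions, since the article does not say exactly how a turn is counted.

```python
from statistics import mean


def conversations_per_session(sessions):
    """Average number of turns per session.

    `sessions` is a list of sessions; each session is a list of turns.
    Assumption: every message in a session counts as one turn.
    """
    if not sessions:
        raise ValueError("need at least one session")
    return mean(len(turns) for turns in sessions)


# Hypothetical sample data: three short sessions with 3, 1 and 2 turns.
sample_sessions = [
    ["hi", "hello!", "how was your day?"],
    ["what's the weather?"],
    ["good night", "sleep well"],
]

cps = conversations_per_session(sample_sessions)
print(f"CPS = {cps}")
```

Under this counting rule, a typical personal assistant would average around two turns per session, while Xiaoice’s reported figure corresponds to sessions averaging 23 turns.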
As of August 2015, 26 percent of data in Xiaoice’s core chat software derives from her conversations with humans, leading Microsoft to conclude she has entered a self-learning and self-growing loop. “Xiaoice, by and large in terms of development of artificial intelligence, is already a huge milestone,” said Dr. Hsiao-Wuen Hon, who leads the project at Microsoft. Although it would be possible to export the software to other countries, Microsoft has no plans to do so at this time.