Live CGI pipeline puts words in the mouth of Donald Trump and Arnold Schwarzenegger
Mon 21 Mar 2016
In a presidential campaign year, any new technology that lets you put words in your opponent’s mouth is bound to raise an eyebrow, and researchers at Stanford University have captured the imagination of tech news sources today with a CGI processing pipeline that transposes one person’s facial movements onto another’s – with examples featuring the likes of Donald Trump, Russian president Vladimir Putin, ex-president George W. Bush and former California governor Arnold Schwarzenegger.
Though touted as ‘live’ – a significant feat – the technique, presented in an impressive video (embedded below), involves analysing enough original footage of the subject to generate the mesh information needed to recreate the interior of the subject’s mouth; presumably a few minutes of initial facial analysis are necessary in order to pre-populate the overlay. The video also provides comparisons with previous work in the field, and notes that the new technique requires only RGB input, rather than RGBD, which includes depth information as well.
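As a rough illustration of the two-phase process described above – an initial calibration pass that banks mouth-interior appearance from the target’s footage, followed by live frames that graft the source actor’s expression onto the target – here is a heavily simplified, hypothetical sketch. Every name and structure in it is illustrative; the actual Face2Face system fits a dense 3D face model and is far more involved.

```python
# Hypothetical sketch of the reenactment pipeline the article describes.
# Phase 1 (calibration): accumulate mouth-interior samples from initial
# RGB footage of the target. Phase 2 (live): combine the target's fixed
# identity with the source actor's per-frame expression, retrieving the
# stored mouth interior that best matches the source's mouth openness.

from dataclasses import dataclass, field

@dataclass
class TargetModel:
    identity: list                                     # fixed shape coefficients for the target
    mouth_buffer: list = field(default_factory=list)   # banked mouth-interior samples

    def calibrated(self, min_frames: int = 3) -> bool:
        # Enough initial footage analysed to cover varied mouth poses?
        return len(self.mouth_buffer) >= min_frames

def calibrate(model: TargetModel, frames: list) -> None:
    """Phase 1: store mouth-interior appearance from each calibration frame."""
    for frame in frames:
        model.mouth_buffer.append(frame["mouth_region"])

def reenact(model: TargetModel, source_expression: dict) -> dict:
    """Phase 2: transfer the source's expression onto the target,
    choosing the banked mouth interior closest in openness."""
    if not model.calibrated():
        raise RuntimeError("need more calibration footage of the target")
    best_mouth = min(
        model.mouth_buffer,
        key=lambda m: abs(m["openness"] - source_expression["openness"]),
    )
    return {
        "identity": model.identity,          # target stays recognisably themselves
        "expression": source_expression,     # but mouths the source's words
        "mouth": best_mouth,                 # with a plausible mouth interior
    }
```

The `RuntimeError` in `reenact` reflects the article’s point: without a buffer of pre-analysed footage, the system has nothing with which to fill the mouth interior, which is why the opening minutes of any targeted video are the hardest to fake.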
The transposition of facial expression via CGI overlays is nothing new in Hollywood, where the technique has been developed and refined over many years, even creating motion-capture ‘A-listers’ such as actor Andy Serkis, who has made a specialised career of generating facial and body movements for grafting onto characters such as Gollum in Peter Jackson’s Lord Of The Rings cycle.
Some of Hollywood’s most ground-breaking work in facial transposition took place under the supervision of VFX lead Ed Ulbrich, for the 2008 David Fincher outing The Curious Case Of Benjamin Button. Actor Brad Pitt was required to generate facial movements that would be superimposed either onto the face of a significantly smaller actor or onto pure CGI creations, and had to have his face covered in a solution designed to capture maximum detail.
Despite some Reddit-based hysteria, the Stanford processing technique – dubbed Face2Face – hardly constitutes the ‘atomic bomb of disinformation’, despite the clever choice of test footage subjects; full simulation would require a vocal element, and the current state of known research into authentic voice simulation is in its infancy. Additionally, the buffer requirement for gathering information about the mouth interior would suggest that one could hope to trust at least the first few minutes of any video subjected to the process. Furthermore, the video does not show us the resulting composite footage at HD resolution, so authenticity at up to 4K remains questionable.
But let us suppose that all these factors are overcome; in the field of American politics, the use of Face2Face could hardly lower the current level of trust in politicians anyway. And if the video camera is soon perceived to be lying, it’s only catching up with the still camera in that respect. Perhaps seeing will soon be the beginning of discussion, rather than ‘believing’.