Keystroke dynamics could be used to distinguish children from adults online
Tue 2 Feb 2016

Researchers in Turkey are investigating whether children under 15 could be distinguished from older users based on their typing behaviour, positing that this could facilitate safe child-only environments online, as well as helping police to more effectively identify adults seeking to pass themselves off as children in those contexts.
The paper [PDF], by Yasin Uzun, Kemal Bicakci and Yusuf Uzunay, details a research program which sampled keystroke metrics from 100 users, equally divided between male and female and between children and adults. In initial tests it was able to identify participants under the age of 15 with up to 91% accuracy from a relatively small input sample.
However, follow-up tests, in which adult subjects were asked to type as if they were children, revealed a minimum successful deception rate of 28% across the range of neural network models that were employed to analyse the data (see image below). The researchers suggest that whilst adults have a high success rate in imitating the slower and more hesitant typing of children under 15, analogous data on how children use a computer mouse would be likely to make child-imitation significantly more difficult. This would be particularly so when combined with the linguistic and other anomalies which currently help undercover online police to identify adults trying to contact children online while posing as children themselves.
The scientists also suggest that the study paves the way for further age-specific datasets to be gathered around the use of mobile keyboards, as used on smartphones and tablets.
Keystroke dynamics comes into the spotlight more commonly because of its place in biometric systems designed to authenticate a user (of an online bank, for instance) by the way they hesitate or speed up when using input devices such as a keyboard and mouse.
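To make the idea of hesitating or speeding up concrete, the sketch below derives two timing features often used in keystroke dynamics: dwell time (how long a key is held) and flight time (the gap between releasing one key and pressing the next). It is an illustration only and is not taken from the researchers' MATLAB code; the `KeyEvent` type, the function names and the timestamps are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class KeyEvent:
    """One keystroke; a hypothetical record type used only in this sketch."""
    key: str
    press_ms: float    # time the key went down
    release_ms: float  # time the key came up


def dwell_times(events):
    """How long each key was held down, in milliseconds."""
    return [e.release_ms - e.press_ms for e in events]


def flight_times(events):
    """Gap between releasing one key and pressing the next."""
    return [b.press_ms - a.release_ms for a, b in zip(events, events[1:])]


def summarize(events):
    """Collapse a typing sample into two averaged timing features."""
    dwell = dwell_times(events)
    flight = flight_times(events)
    return {
        "mean_dwell_ms": mean(dwell),
        # A single keystroke has no following key, so no flight time.
        "mean_flight_ms": mean(flight) if flight else 0.0,
    }


# Made-up timestamps for the word "cat"; not data from the study.
sample = [
    KeyEvent("c", 0.0, 90.0),
    KeyEvent("a", 250.0, 330.0),
    KeyEvent("t", 520.0, 620.0),
]
features = summarize(sample)
```

Slower, more hesitant typing of the kind the article attributes to younger users would show up here as larger flight times between keys.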
The group has made its MATLAB test implementations for the study available online, including full source code, a Windows executable and MATLAB scripts.
Successful differentiation of age groups in the study varied across the 13 machine learning approaches applied to the data. Support Vector Machine and BFGS quasi-Newton backpropagation both reported the lowest error rate, at 8.8%.
In a caveat common to many research reports involving neural networks, the researchers indicate that applying any of the 13 engines used on the data in real time would be resource-intensive, but allow that simpler derived algorithms would be easier to execute.
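An error rate of 8.8% is the complement of the headline accuracy figure: 100% minus 8.8% is 91.2%, or roughly 91%. The sketch below shows how such a figure could be computed from a classifier's child/adult predictions. The function names and the five labels are hypothetical and are not results from the study.

```python
def error_rate(true_labels, predicted_labels):
    """Fraction of samples whose predicted age group is wrong."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    wrong = sum(1 for t, p in zip(true_labels, predicted_labels) if t != p)
    return wrong / len(true_labels)


def accuracy(true_labels, predicted_labels):
    """Accuracy is simply one minus the error rate."""
    return 1.0 - error_rate(true_labels, predicted_labels)


# Made-up labels for illustration; one of five predictions is wrong.
truth = ["child", "child", "adult", "adult", "child"]
predicted = ["child", "adult", "adult", "adult", "child"]

sample_error = error_rate(truth, predicted)
sample_accuracy = accuracy(truth, predicted)
```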
The children involved in the study participated with the approval of the Human Subjects Ethics Committee of the Middle East Technical University, and with the agreement of their parents, and were not privy to the research objectives of the trial during participation.