The Stack Archive News Article

198 million voter records leaked by analysis firm

Tue 20 Jun 2017

Security researchers have discovered a massive data security breach that exposed almost 200 million U.S. voter records on the internet.

UpGuard, the security firm that discovered the breach, said it was by far the largest of its kind to date. More than a terabyte of completely unsecured data was found in an Amazon Web Services S3 bucket in the public cloud, accessible to anyone with an internet connection who navigated to the correct subdomain.

The data was deposited by Deep Root Analytics, one of the firms used for data mining by the Republican National Committee during the 2016 U.S. presidential election.

Data compiled by TargetPoint Consulting, Inc. and Data Trust, both Republican data analytics contractors, was included in the breach.

The exposed data included the name, home address, date of birth, phone number and voter registration details of each individual. Ethnicity and religion were also included, although these categories held assumptions, or best guesses, modelled on other personal data rather than hard data.

Critical data was stored in two folders, one called data_trust and the other target_point. Together, the two folders contained 1.1 terabytes of information collected from and modelled on American voter records.

The data_trust folder contained demographic information about each voter, both hard data and algorithmically calculated ‘best guesses’ in critical categories. The target_point folder consisted of 14 separate files, each created for large-scale data analysis, with millions of rows of microdata covering all the voters in the system. One file, for example, listed each potential voter individually and rated their likelihood of supporting the policy, candidate, or issue named at the top of each column. These categories ranged from how likely the individual was to have voted for Obama in 2012, to whether they agreed with the Trump foreign policy of ‘America First’, to how likely they were to be concerned with auto manufacturing as an issue, among others.

The target_point files identified each voter by a 32-character RNC ID that also appeared in the data_trust folder, making it simple to tie all of the data back to an individual.
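To illustrate why a shared identifier makes re-identification so easy, here is a minimal Python sketch of joining two tables on a common key. The field names (`rnc_id`, `name`, `state`, `issue_score`), the helper `join_on_id` and every value are hypothetical and invented for this example; nothing here reflects the actual layout of the leaked files.

```python
# Hypothetical rows: all field names and values are invented for illustration.
data_trust = [
    {"rnc_id": "a" * 32, "name": "Voter A", "state": "OH"},
    {"rnc_id": "b" * 32, "name": "Voter B", "state": "FL"},
]
target_point = [
    {"rnc_id": "b" * 32, "issue_score": 0.81},
    {"rnc_id": "a" * 32, "issue_score": 0.27},
]


def join_on_id(profiles, scores, key="rnc_id"):
    """Attach each score row to the profile that shares its identifier."""
    # Index the profiles by ID so each lookup is a single dictionary access.
    by_id = {row[key]: row for row in profiles}
    joined = []
    for score in scores:
        profile = by_id.get(score[key])
        if profile is not None:
            # Merge the demographic fields with the modelled score.
            joined.append({**profile, **score})
    return joined


linked = join_on_id(data_trust, target_point)
for row in linked:
    print(row["name"], row["state"], row["issue_score"])
```

Once both tables carry the same key, a single dictionary lookup per row is enough to put a name and address next to every modelled score.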

Deep Root, TargetPoint and Data Trust together accumulated 9.5 billion data points on American voters, using algorithmic modelling to compile information in 48 different categories on each of the 198 million potential voters in the presidential election.
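As a rough consistency check on these figures, the short Python sketch below multiplies the roughly 198 million records in the headline by the 48 categories. It assumes exactly one data point per voter per category, which is an assumption made here for the arithmetic and is not stated in the report.

```python
# Rough consistency check; assumes one data point per voter per category.
voters = 198_000_000   # records reported in the headline
categories = 48        # categories reported per voter

data_points = voters * categories
print(f"{data_points:,} data points (about {data_points / 1e9:.1f} billion)")
```

Under that assumption the product comes to about 9.5 billion, in line with the reported total.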

UpGuard researcher Chris Vickery discovered the unsecured data and notified federal authorities, and the bucket was secured shortly thereafter.

