Category Archives: Political World IQ Trainwrecks

Green Card, Red Faces

The United States Government is being sued in a massive class-action suit representing Green Card applicants from over 30 countries which alleges that the United States unfairly denied 22,000 people a Green Card due to a computer blunder.

This story is reported in the Irish Times and the Wall Street Journal.

It is not in the remit of this blog to debate the merits of awarding working visas on the basis of a random lottery, but this is precisely what the Green Card system is, offering places to 50,000 people each year based on a random selection of applications submitted over a 30 day period. According to the WSJ:

In early May, the State Department notified 22,000 people they were chosen. But soon after, it informed them the electronic draw would have to be held again because a computer glitch caused 90% of the winners to be selected from the first two days of applications instead of the entire 30-day registration period.

Many of these 22,000 people are qualified workers who had jobs lined up contingent on their getting the Green Card. The WSJ cites the example of a French neuropsychology PhD holder (who earned her PhD in the US) who had a job offer contingent on her green card.

The root causes that contributed to this problem are:

  1. The random sampling process did not pull records evenly from the entire 30-day period: 90% of the “winners” were drawn from the first two days of applications.
  2. There was no review of the sampling process and outputs before the notifications were sent to the applicants and published by the State Department. It appears there was a time lag before the error was identified and the decision was taken to scrap the May Visa Lottery draw.

The first error looks like a possible case of a poorly designed sampling strategy in the software. The regulations governing the lottery draw require that there be a “fair and random sampling” of applicants. As 90% of the winners were drawn from the first two days, the implication is that the draw was not fair enough or was not random enough. At the risk of sounding a little clinical, however, fair and random do not always go hand in hand when it comes to statistical sampling.

If the sampling strategy was to pool all the applications into a single population (N) and then randomly pull 50,000 applicants (sample size n), then all applicants had a statistically equal chance of being selected. The fact that the sampling pulled records from the same date range is then an interesting correlation or coincidence. Indeed, the date of application would be irrelevant to the sampling extraction as everyone would be in one single population. Of course, that depends to a degree on the design of the software that created the underlying data set (were identifiers assigned randomly or sequentially before the selection/sampling process began, etc.).

This is more or less how your local State or National lottery works… there is a defined sample of balls pulled randomly which create an identifier which is associated with a ticket you have bought (i.e. the numbers you have picked). You then have a certain statistical chance of a) having your identifier pulled and b) being the only person with that identifier in that draw (or else you have to share the winnings).

If the sampling strategy was to pull a random sample of 1,666.67 records (50,000 divided by 30) from each of the 30 days, that is a different approach. Each person on each day of application has the same chance as anyone else who applied that day, with each day having an equal chance at the same number of applicants being selected. Of course, it raises the question of what you do with the rounding difference you are carrying through the 30 days (equating to 20 people) in order to still be fair and random (a mini-lottery perhaps).
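To make the difference between the two strategies concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the application volumes are invented, the function names are hypothetical, and the "mini-lottery" simply picks 20 days at random to receive one extra place each. It is not a description of how the State Department's software worked.

```python
import random
from collections import Counter

DAYS = 30
WINNERS = 50_000


def pooled_draw(applications, n, rng):
    """Strategy 1: one population, every application has the same chance."""
    return rng.sample(applications, n)


def per_day_draw(applications, n, days, rng):
    """Strategy 2: equal quota per day, remainder settled by a mini-lottery."""
    by_day = {day: [] for day in range(1, days + 1)}
    for app in applications:
        by_day[app[1]].append(app)

    # 50,000 places over 30 days: 1,666 per day with 20 places left over.
    quota, remainder = divmod(n, days)
    # Mini-lottery (assumption): choose `remainder` days to get one extra place.
    bonus_days = set(rng.sample(range(1, days + 1), remainder))

    winners = []
    for day, apps in by_day.items():
        places = quota + (1 if day in bonus_days else 0)
        winners.extend(rng.sample(apps, places))
    return winners


def make_applications():
    """Invented data: (applicant_id, day) pairs, busier on the first two days."""
    applications = []
    for day in range(1, DAYS + 1):
        volume = 10_000 if day <= 2 else 3_000
        applications.extend((f"D{day:02d}-{i:05d}", day) for i in range(volume))
    return applications


rng = random.Random(2011)
applications = make_applications()

for label, winners in [
    ("pooled", pooled_draw(applications, WINNERS, rng)),
    ("per day", per_day_draw(applications, WINNERS, DAYS, rng)),
]:
    per_day = Counter(day for _, day in winners)
    early_share = (per_day[1] + per_day[2]) / len(winners)
    print(f"{label}: {early_share:.1%} of winners applied in the first two days")
```

In the pooled draw, busy days naturally produce more winners, in proportion to their share of applications. In the per-day draw every day gets the same number of places regardless of volume, so an applicant's odds depend on which day they applied. Both are "random", but they are fair in different senses.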

Which raises the question: if the approach was the “random in a given day” sampling strategy why was the software not tested before the draw to ensure that it was working correctly?

In relation to the time lag between publication of the results and the identification of the error, this suggests a broken or missing control process in the validation of the sampling to ensure that it conforms to the expected statistical model. Again, in such a critical process it would not be unreasonable to have extensive checks but the checking should be done BEFORE the results are published.
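A pre-publication control of this kind does not need to be elaborate. The sketch below is one hypothetical check, assuming a pooled draw in which each day's share of winners should roughly track its share of applications. The function names and the two-percentage-point tolerance are my own assumptions, not anything taken from the lottery regulations.

```python
from collections import Counter


def day_share_deviations(application_days, winner_days):
    """Per day: observed share of winners minus share of applications."""
    apps = Counter(application_days)
    wins = Counter(winner_days)
    total_apps = sum(apps.values())
    total_wins = sum(wins.values())
    return {
        day: wins.get(day, 0) / total_wins - apps[day] / total_apps
        for day in apps
    }


def draw_looks_suspect(application_days, winner_days, tolerance=0.02):
    """Flag the draw if any day's winner share is off by more than `tolerance`."""
    deviations = day_share_deviations(application_days, winner_days)
    return any(abs(dev) > tolerance for dev in deviations.values())


# Invented example: 1,000 applications on each of 30 days.
application_days = [day for day in range(1, 31) for _ in range(1000)]

# A draw where 90% of winners come from days 1 and 2 (252 of 280).
skewed_winners = [1] * 126 + [2] * 126 + list(range(3, 31))
# A draw where every day supplies the same number of winners.
even_winners = [day for day in range(1, 31) for _ in range(10)]

print("skewed draw suspect:", draw_looks_suspect(application_days, skewed_winners))
print("even draw suspect:", draw_looks_suspect(application_days, even_winners))
```

A real control would more likely use a formal goodness-of-fit test, but even a crude tolerance check like this one, run before the notifications went out, would flag a draw where two days supplied 90% of the winners.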

Given the basis of the Class Action suit, expect to see some statistical debate in the evidence being put forward on both sides.

Electoral finger flub causes kerfuffle

Via the twitters and google comes this story from Oh Canada about the unforeseen confluence of an election, the adoption of new technology (QR codes), and a careless fingerflub that has resulted in a bit of embarrassment for a Liberal party candidate.

This is the comedic counterpoint to our story last month of the finger flub that resulted in death and lawyers.

It seems that staffers working for candidate Justin Trudeau fat-fingered the creation of the QR code that is being used on his posters. The code was meant to contain a URL for the Liberal Party, but they hit the “U” key instead, creating a URL that sent people to a “lifestyle” site that promoted the use of lubricants in sexual activity.

Sadly Luberal.ca has been taken down at the request of the party, and it seems that they may be in discussion to buy the domain name from the current owner. The candidate has tweeted about the issue on his twitter feed, and staff have been dispatched out to replace the offending QR code with a corrected version.

All of which adds up to cost and resource headaches for an election candidate who probably had other things planned for his staff to be doing at this stage in the campaign.

Of course, we remain slightly concerned that, given that it is April 1st this may be too good a story to be true. But in that case take it as a parable of what could happen, not necessarily a report of what did!

Duff Data dumps 1 million on the dole (social security)… in France.

The Register carries a story this week that clearly shows the impact of poor quality information on people, particularly in this time of tightening economic conditions when getting a job is hard enough.

It appears that the French Government’s Police Vetting database is not very complete or accurate. According to the French Data Protection authorities (CNIL), the

serious issues over the provenance of data illustrate all too clearly what happens when the government starts to collect data on its citizens without putting adequate measures in place for updating and accuracy checking.

It would appear that there are errors in 83% of records, with a range of degrees of seriousness. The errors in the database arise as a result of “the simple mechanism of mis-recording actual verifiable data”. In other words, poorly designed data creation and maintenance processes, poorly trained staff, and a lack of information quality control contribute to the errors.

But what of the cost to the French economy? Well, every person who has been blacklisted in error by this system is potentially drawing social security payments. On top of that they are not paying taxes into the French economy.

If, in a month, they are drawing €100 in social insurance instead of paying €100 in taxes, the net loss to the French economy is €200 per month. €200 x 1 million = €200 million per month, or €2,400,000,000 per year.
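The same back-of-the-envelope sum, written out as a small Python sketch so the assumptions are easy to change. The €100 figures are the illustrative guesses used above and the function name is hypothetical; none of this is measured data.

```python
def annual_cost_of_errors(people, benefit_per_month, tax_per_month):
    """Rough yearly cost: benefits paid out plus taxes not collected."""
    net_loss_per_person = benefit_per_month + tax_per_month
    monthly_loss = net_loss_per_person * people
    return monthly_loss * 12


# Illustrative assumptions, in euro.
people_blacklisted = 1_000_000
monthly = annual_cost_of_errors(people_blacklisted, 100, 100) // 12
yearly = annual_cost_of_errors(people_blacklisted, 100, 100)

print(f"Monthly loss: EUR {monthly:,}")
print(f"Yearly loss:  EUR {yearly:,}")
```

Halve or double either assumption and the total scales in proportion, which is the point of a guesstimate: the order of magnitude is what makes the business case.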

So, on the basis of a very rough guesstimate, the value to the French economy of fixing these errors is €2.4billion per year. Is that a business case?

Trusted Electoral Information

Introduction

Warning – this is a long and detailed examination of a complicated trainwreck

[Update] The IAIDQ has issued a press release on this topic…Election Throws A Spotlight On Poor Data Quality. [/update]

In every democracy citizens must be able to trust that the State will not impede their right to vote through any act or omission on the part of the State or its agents. Regular visitors to the iqtrainwrecks.info blog will know that Ireland has its fair share of problems with its electoral register. Of course, that isn’t news.

However, the Washington Post reported last weekend (18th October) that the US elections are being plagued by similar issues. The New York Times covers the same ground in this story from 9th October. With a slightly important vote coming up on the 4th of November, that is news.

In a saga that has found its way to the US Supreme Court (in at least one case so far), voters are being forced to re-establish their eligibility to vote before the election on November 4th. As the Post points out, “many voters may not know that their names have been flagged” which could “cause added confusion on Election Day”.

So what is going on (apart from the lawyers getting richer off the inevitable lawsuits and voters finding themselves reduced to just “Rs” as they lose their Vote)? Where is the trust being lost? Why is this an IQ Trainwreck?

A Change of Process and a Migration of Data

Under the Help America Vote Act, responsibility for the management of electoral registers was moved from locally managed (i.e. county level) to state administered. This has been trumpeted as a more efficient and accurate way to manage the accuracy of electoral lists. After all, the states also have the driver licensing data, social welfare data and other data sources to use to validate that a voter is a voter and not a gold fish.

However, where discrepancies arise between the information on the voter registration and other official records, the voter registration is rejected. And as anyone who has dealt with ‘officialdom’ can testify, very often those errors are outside the control of the ‘data subject’ (in this case the voter). The legislation requires election officials to use the state databases first, with recourse to the Federal databases (such as social security) supposedly reserved as a ‘last resort’ because, according to the New York Times, “using the federal databases is less reliable than the state lists and is more likely to incorrectly flag applications as invalid”.

Of course, for a comment on the accuracy of state databases I’ve found this story on The Risks Digest which seems to sum things up (as a caveat, I’ll point out that the story is 10 years old, but my experience is that when crappy data gets into a system it’s hard to get it out). In the linked-to story, the author (living in the US) tells of her experience with her driver’s license, which insisted on merging her first initial and middle name (the format she prefers to use) to create a new non-name that didn’t match her other details. That error then propagated onto her tax information and appeared on a refund cheque she received.

In short, it would seem she might have a problem voting (if her driver’s license and tax records haven’t been corrected since).

Accuracy of Master Data, and consistency of Master Data

The anecdote above highlights the need for accuracy in the master data that the voter lists are being validated against. For example, the Washington Post article cites the example of Wisconsin, which flags voters for data discrepancies “as small as a middle initial or a typo in a birth date”.

I personally don’t use the apostrophe in my surname. I’m O Brien, not O’Brien. Also, you can spell my first name over a dozen different ways (not counting outright errors). A common alternate spelling is Darragh, as opposed to Daragh. It looks like in Wisconsin I’d have high odds of joining the four members of their 6-strong state elections board who all failed validation due to mismatches on data.
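To illustrate why strict matching trips over names like mine, here is a minimal Python sketch comparing an exact comparison with a more tolerant one. This is purely an illustration under my own assumptions: the function names and the 0.9 threshold are hypothetical, and nothing here describes how any state actually validates registrations.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalise(name):
    """Lower-case the name, strip accents, and drop punctuation and spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z]", "", ascii_only.lower())


def exact_match(a, b):
    """The strict rule: any difference at all is a mismatch."""
    return a == b


def tolerant_match(a, b, threshold=0.9):
    """Match if the normalised names are at least `threshold` similar."""
    ratio = SequenceMatcher(None, normalise(a), normalise(b)).ratio()
    return ratio >= threshold


registration = "Daragh O Brien"
licence_record = "Darragh O'Brien"

print("exact:", exact_match(registration, licence_record))
print("tolerant:", tolerant_match(registration, licence_record))
```

The threshold is a trade-off. Set it too loose and different people start to match each other; set it too tight and you are back to rejecting voters over an apostrophe. Either way, a near-miss is better treated as a prompt for a human to check than as grounds for automatic rejection.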

In Alabama, there is a constitutional ban on people convicted of felony crimes of “moral turpitude” voting. The Governor’s Office has issued one masterlist of 480 offences, which included “disrupting a funeral” as a felony. The Courts Administrator and Attorney General issued a second list of more violent crimes to be used in the voter validation process. Unfortunately, it seems that the Governor’s list was used until very recently instead of the more ‘lenient’ list provided by the Courts Administrator.

Combine this with problems with the accuracy of other master data, such as lists of people who were convicted of the aforementioned felonies and there is a recipe for disenfranchisement. Which is exactly what has happened to a former governor (a Republican at that) called Guy Hunt.

In 1993 Mr Hunt had been convicted of a felony related to ethics violations. He received a pardon in 1998. In 2008 his name was included on a “monthly felons check” sent to a county Registrar. Mr Hunt’s name shouldn’t have been on the list.

According to the Washington Post article, Mr Hunt isn’t the only person who was wrongly included on the felon list. 40% of the people named on the list seen by the Washington Post had only committed misdemeanors. In short, the information was woefully inaccurate.

But it is being used to de-register voters and deprive them of their right to have their say on the 4th November.

The Washington Post also cites cases where US citizens have been flagged as non-citizens (and therefore not entitled to vote) due to problems with social security numbers. Apparently some election officials have found the social security systems to be “not 100% accurate”. But this is the reason why they are supposed to be used only when the state systems on their own are insufficient to verify the voter. (That’s the law, apparently.)


Super Tuesday – Gloomy e-voting Wednesday

The Register has highlighted not one but two potential information quality problems with the ballots being cast in the pre-Presidential primaries in the US.

Firstly, Democrats Abroad decided to support Democrats living overseas by letting them vote on-line. This sounds like an excellent idea, this ‘electronic voting’. However, there seem to have been some concerns with the way the ballot was conducted. Apparently the ‘receipt’ that was produced to evidence the voter’s choice just showed the choice, with no other reference that could be used to support an audit. And to cap things off, when one voter tried to print her ‘receipt’ all she got was a blank sheet of paper.

David Dill and Barbara Simons are two experts in the field who wrote a nice piece on this precise risk on Monday over on www.news.com.

Of course, it’s not just internet voting that is all a-jitter. With electronic voting being a much used technology in the US, it was timely that a report was issued by two voting advocacy groups that highlighted that six of the twenty-four (25%) States are at high risk of malfunction of, or tampering with, their e-voting machines, with a further 5 States being at medium risk. That’s almost 50%.

Sheesh. It’s a good job that the stakes are so low with that level of risk in the process and with the audit trails not really being audit trails and the receipts printing out blank.

It is, after all, only a race for the Presidency of the United States. Surely the expectation of accuracy and completeness in those ballot counts will be low?

So, they’ve got guns and are trained to kill…

let’s screw with their pay…

From The Register comes this story about ongoing information quality problems in the British Forces pay and personnel system. There have been complaints of pay being withheld for months.

The MoD blames the data input staff and insists that the system is working fine.

“Input errors based on a degree of unfamiliarity with the new scheme have resulted in a small number of pay inaccuracies,” according to the MoD.

“Thorough investigation of these errors has shown that the JPA system is working extremely well… JPA… requires accurate and timely input from… HR administrators.”

Additional training measures are, of course, being provided to staff and the MoD is keen to point out the long term benefits of the system in terms of reduced manpower needs in HR and fewer systems to maintain.

Of course, the complexity of the payroll system should not be underestimated, particularly if there are staff at similar grade with differing pay structures. Add to that the requirement for rock solid security (given the sensitivity of the information) and the system requirements become even more complex.

However, basic validation and verification of information (perhaps a reconciliation between the new system and the old system at data migration) might have mitigated this problem.
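A reconciliation of that kind could be as simple as the Python sketch below. It is a hypothetical illustration only: the service numbers, pay amounts (held in pence to avoid rounding problems) and function name are invented, and it says nothing about how JPA or its predecessor actually store data.

```python
def reconcile(old_pay, new_pay):
    """Compare monthly pay per service number between old and new systems.

    Both arguments map a service number to pay in pence. Returns a list of
    (service_number, problem) pairs for anything that does not line up.
    """
    issues = []
    for service_no, old_amount in old_pay.items():
        if service_no not in new_pay:
            issues.append((service_no, "missing from new system"))
        elif new_pay[service_no] != old_amount:
            issues.append(
                (service_no, f"pay changed: {old_amount} -> {new_pay[service_no]}")
            )
    # Records that appear only after migration also deserve a look.
    for service_no in sorted(new_pay.keys() - old_pay.keys()):
        issues.append((service_no, "not present in old system"))
    return issues


# Invented sample data.
legacy = {"A1001": 150_000, "A1002": 210_000, "A1003": 98_000}
migrated = {"A1001": 150_000, "A1002": 0, "A1004": 120_000}

for service_no, problem in reconcile(legacy, migrated):
    print(service_no, problem)
```

Not every difference is an error, since pay rules may legitimately change with a new scheme. But an exceptions list like this, reviewed before the first pay run, turns "months of withheld pay" into a queue of records to check.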

Why is this an IQ Trainwreck? Well, they’ve pissed off members of some of the most elite fighting units in the world… not something I’d do.

More people registered to vote than live in Ireland

The background

An Irish Sunday newspaper broke the story in 2006 that there were up to 860,000 more people registered to vote in the Irish Republic than actually lived there. To put that in perspective, the number of persons resident in Ireland who were of an age to vote was only approximately 2.6 million. This represented a significant issue.

The approach of the Irish Government to the issue was to dispatch personnel to go door to door checking voter registrations. This was a form of scrap and rework. It was conducted over a period of approximately 3 months in late 2006. The work practices involved in this review varied between local government areas. In the electoral constituency of the Minister for the Environment (who has ultimate responsibility for the Electoral Register) at least one entire housing estate (of a few hundred houses) ‘disappeared’ off the Electoral Register.

The litany of issues is too long to go into here… check out my personal blog site for some more background.

Why is this a trainwreck?

There are a variety of reasons why this is an Information Quality Trainwreck:

  1. It has a fundamental effect on a key process in democracy.
  2. It would appear that divergent processes, poorly defined processes and a failure to define and manage processes in a way that reflects ‘life events’ that might change the electoral register were part of the root cause of the problem.
  3. There was a focus on scrap and rework to address the issue. There has been no substantive or tangible official review of the root causes for these problems. There are some anecdotes, however, of Electoral Register clerks in some parts of the country using the Obituary pages from local papers to identify people to be taken off the Register, as they didn’t know that there was a central register of Deaths that could provide them with that information.
  4. The ‘tone from the top’ was one of creating fear and spreading blame. The Government Minister in charge berated local authorities for not doing a good enough job. However it seems that there was a fundamental failure to provide the local authorities with the tools and processes they needed to do that job.

Current State

In Ireland we are less than a month away from a General Election. Our Electoral Register is now known to be flawed and inaccurate. The root causes have not been addressed and whatever ‘clean-up’ was achieved through the manual scrap and rework will have degraded as it is now over 6 months old.

The information does not meet or exceed our expectations and there is a fundamental risk to the quality of our elections.