Category Archives: General Trainwrecks

Google maps inaccuracies

We spotted this on Gawker.com. From my experience using Google Maps, it rings true (I was recently sent 15 miles out of my way on a trip in rural Ireland).

It seems that Google Maps has plotted the location of a tourist attraction in New Jersey right at the end of a driveway to a private residence. So, on the 4th of July weekend, the owners of the property had to fend off increasingly irate visitors who were looking for the lake and wound up in a private driveway.

So, the data is inaccurate and of poor quality. Have Google responded to their error and replotted the location of the tourist area at the lake? Not yet, according to the story on Gawker.

Why 2k?

IT media sources are reporting that the demise of the Y2k bug may have been declared prematurely. (see also here)

Systems affected included spam control software and other security software from a leading vendor, network equipment from leading vendors, credit card payment systems in Germany and Australia, and (it seems) Windows Mobile. The bug was tweeted heavily on Twitter.

The effect of this bug seems to have been to catapult messages forward in time by a few years, resulting in credit card terminals rejecting cards as they failed date validation checks (the card expiry date was in the past apparently), valid emails being flagged as spam (because the message was date stamped in the future), and SMS messages appearing to come from the future.

The potential knock-on impacts of this error don’t bear thinking about. In the immediate term we have:

  • Embarrassment for credit card wielding shoppers who found themselves unable to pay for purchases or meals.
  • Missed emails due to their being flagged incorrectly as spam (although this has since been fixed).
  • SMS confusion.
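To see why a clock thrown forward breaks expiry validation, here is a minimal sketch – the function and the dates are illustrative, not any terminal vendor's actual logic:

```python
from datetime import date

def card_expired(expiry_year: int, expiry_month: int, today: date) -> bool:
    """A card is valid through the last day of its expiry month."""
    return (expiry_year, expiry_month) < (today.year, today.month)

# Checked against the real date, a card expiring 12/2012 is fine...
assert not card_expired(2012, 12, date(2010, 1, 4))

# ...but against a terminal clock catapulted forward by the bug,
# the same valid card now "fails" the expiry check.
assert card_expired(2012, 12, date(2016, 1, 4))
```

The card data is perfectly accurate in both cases; it is the reference date the rule runs against that is wrong.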

But, in this automated world where processes are triggered by business rules based on facts and information there are potentially other impacts:

  • Discovery of emails or SMS messages in criminal or civil litigation (will the lawyers think of looking in the future? Can the evidence be verified if it appears to be from the future?)
  • Electronic transfer of data or funds based on rules
  • Calculation of interest payments or penalties based on date rules

The root cause of this problem appears to have been assumptions about dates: the thought in 1999 that 2010 was sufficiently far in the future that (one must assume) a better fix for the rules being applied would have been developed by then.
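A widely reported mechanism behind this particular glitch was a binary-coded-decimal (BCD) year being read as a plain binary number. A minimal sketch of the mix-up – illustrative only, not any vendor's actual firmware:

```python
def bcd_to_int(b: int) -> int:
    """Decode a binary-coded-decimal byte: 0x10 means decimal 10."""
    return (b >> 4) * 10 + (b & 0x0F)

year_byte = 0x10  # the year 2010, as a clock chip might store it in BCD

correct = 2000 + bcd_to_int(year_byte)   # decodes to 2010
buggy   = 2000 + year_byte               # raw byte read as binary: 2016

assert correct == 2010
assert buggy == 2016
```

The same byte, read under two different assumptions, yields dates six years apart – which matches the "messages from the future" symptom neatly.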

These are the IQ trainwrecks in your neighbourhood

Stumbled upon this lovely pictorial IQTrainwreck today on Twitter. Thanks to Angela Hall (@sasbi) for taking the time to snap the shot and tweet it and for giving us permission to use it here. As Angela says on her Twitpic tweet:

Data quality issue in the neighborhood? How many street signs (with diff names) are needed? Hmmmm

In the words of Bob Dylan: “How many roads must a man walk down?”

The Retail Data Nightmare

Over at SmartDataCollective, Daniel Gent has shared an excellent example of a very common scenario in organizations across the globe… the troubling matter of the duplicated, fragmented and inconsistent customer details.

He shares a story with his readers about a recent trip to a retail store which used his telephone number as the search string to find his customer profile. The search returned no fewer than 7 distinct customer records, all of them variations on a theme. Daniel enumerates the records thusly:

1) One past owner of the phone from over 15 years ago;
2) Three versions of my name;
3) Two versions of my wife’s name; and,
4) One record that was a joint name account.

The implications that Daniel identifies for this are direct and immediate costs to the business:

  • Multiple duplicate direct mailings per year offering services
  • Multiple call centre contacts offering yet more services
  • Potential problems calling on his warranty for the goods he bought, because the store can’t tell which of his customer records the warranty details are associated with.
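The kind of fragmentation Daniel describes is easy to reproduce. A minimal sketch of a phone-number-keyed duplicate check – the names and numbers here are invented, not Daniel's actual records:

```python
import re
from collections import defaultdict

def normalize_phone(raw: str) -> str:
    """Keep digits only, so formatting variants collide on one key."""
    return re.sub(r"\D", "", raw)

# Hypothetical records echoing the fragmentation described in the post.
records = [
    {"name": "Daniel Gent", "phone": "(555) 010-2368"},
    {"name": "Dan Gent",    "phone": "555-010-2368"},
    {"name": "D. Gent",     "phone": "5550102368"},
    {"name": "Jane Gent",   "phone": "555 010 2368"},
]

by_phone = defaultdict(list)
for rec in records:
    by_phone[normalize_phone(rec["phone"])].append(rec["name"])

# All four records share one normalized key: one household, many profiles.
assert len(by_phone) == 1
assert len(by_phone["5550102368"]) == 4
```

Grouping on a normalized key only surfaces the duplicates, of course; deciding which record is the survivor still needs a human or a matching rule.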

Of course, this is just the tip of the iceberg.

Daniel’s experience is all too common. And it is a Retail Data Nightmare that can quickly turn into an Information Quality trainwreck if left unchecked.

It’s the end of the world as we know it… or is it?

Yahoo is today carrying a story from AFP about a 13-year-old German schoolboy who has corrected NASA’s calculations on the probability of the ‘planet killer’ asteroid Apophis crashing into Earth and causing a global catastrophe. The wunderkind in question did his analysis as part of a regional science competition.

It seems that NASA forgot to factor in the effect of Apophis hitting one or more of the numerous satellites that orbit the Earth in close proximity to the path the asteroid will take on its next pass by the Earth in 2029. Apparently, if it hits a satellite, the odds of Apophis hitting the Earth in 2036 drop from a lottery-like 1 in 45,000 to a more troubling 1 in 450.

The IQ issue here is completeness of information. NASA failed to take into account the satellites in its risk model, resulting in a whopping understatement of the risk to the planet.
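The understatement can be sketched with the law of total probability. Only the two headline odds (1 in 45,000 and 1 in 450) come from the story; the conditional figures below are invented purely so the arithmetic lands near those numbers:

```python
# Law of total probability over the "hits a satellite in 2029" branch.
# Only 1/45,000 and 1/450 come from the article; the rest is illustrative.
p_sat_hit            = 0.005       # assumed chance of clipping a satellite
p_impact_given_hit   = 0.44        # assumed chance of a 2036 impact after deflection
p_impact_given_clear = 1 / 45_000  # NASA's original figure (satellite branch ignored)

p_impact = (p_sat_hit * p_impact_given_hit
            + (1 - p_sat_hit) * p_impact_given_clear)

assert abs(p_impact - 1 / 450) / (1 / 450) < 0.05   # roughly 1 in 450
```

Leaving a branch out of the model entirely is what produces the hundred-fold understatement: (1/450) / (1/45,000) = 100.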

The short-term IQ Trainwreck comes about because the NASA scientists were corrected by a 13-year-old. The long-term IQ Trainwreck comes about because the 1 in 45,000 odds are probably firmly fixed in the minds of disaster recovery planners around the world, giving rise to a degree of complacency, whereas a 1 in 450 risk might prompt some consolidated efforts to figure out how to properly manage the risk of Apophis hitting the earth or handling the ensuing global catastrophe.

However, more recent reports this afternoon suggest that the wunderkind may be a blunder-kind and may have based his model on some incorrect assumptions about the path of the asteroid. Oh dear.

In any event, the fact that a teenager is interested in this and took the time to research the risks is commendable. Hopefully someone will reassess his work and determine whether he is wrong or just not as right as he thought, improving the accuracy of the prediction models for Apophis.

Good quality information can help save the planet.  Poor quality information can send people unnecessarily into a panic.

Media trainwrecks (one of two)

Courtesy of Irish uber-blogger and technology journalist Damien Mulley come two excellent examples of poor quality information getting loose.

The first concerns an article published in the Irish Examiner newspaper. It published a story this week which purported to show that Irish employers were losing millions of euro due to staff members using social networking sites like Bebo or Facebook. Mr Mulley found no fewer than five errors in the article, ranging from the fact that the survey being referenced was a UK survey, with 50% of the respondents interviewed in one location (which wasn’t in Ireland), to basic errors in the mathematics of working out the cost to the Irish economy. As Damien helpfully points out (once he had fixed his own miscalculations), for the Irish Examiner’s figures to make any sense the average salary in Ireland would need to be over €120k a year.

 …take it from me… it’s not.

As Damien’s site is a blog there are some interesting comments which correct his calculations and provide alternate ways of calculating the costs to the Irish economy of Social Networking. None of them reach the same conclusions as the Irish Examiner.

 The second example will follow in the next post.

Dream host, Billing Nightmare

Courtesy (yet again) of The Register comes this case of poor Information Quality. It seems that US web hosting company DreamHost accidentally overbilled its customers for services due to what has been described as a “fat finger error”.

Full details of the good intentions that paved the path to this Information Quality Hell can be found on the company’s blog – they are refreshingly honest, if perhaps misreading the seriousness of tone that these type of issues require. Also some questions appear to be still unanswered (like how did some customers get billed twice for future dates). The ‘official story’ can be found on their Status site. On both sites the comments illustrate the impact on their customers.

Why is this an IQ Trainwreck? Well, by the company’s own admission, nearly every one of their customers has been overbilled. Many of these customers may have incurred additional bank or credit card charges if they exceeded overdraft or credit limit thresholds – charges which will probably have to be refunded by DreamHost.

The root cause – a fat finger that created parameters for manual rebilling checks that were in the future: 2008 was the year keyed in, not 2007. And their billing software did not contain a business rule to either prevent or validate any attempt to bill for a future date.
DreamHost failed to identify the need for a proofreading check to ensure that data going into a process (such as dates) falls within reasonable bounds for that process (choosing, of course, to blame the software). Many of the 415 commenters on their blog have picked up on this simple step that could have avoided this IQ Trainwreck.
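The ‘reasonable bounds’ check is cheap to implement. A sketch of the missing business rule – this is illustrative Python, not DreamHost's actual billing code:

```python
from datetime import date

class BillingDateError(ValueError):
    pass

def check_billing_date(bill_date: date, today: date) -> date:
    """Reject billing runs outside reasonable bounds for the process."""
    if bill_date > today:
        raise BillingDateError(f"refusing to bill for a future date: {bill_date}")
    if bill_date < today.replace(year=today.year - 1):
        raise BillingDateError(f"billing date suspiciously old: {bill_date}")
    return bill_date

today = date(2007, 1, 15)
check_billing_date(date(2007, 1, 14), today)        # fine

try:
    check_billing_date(date(2008, 1, 14), today)    # the fat-fingered year
except BillingDateError as e:
    print(e)    # refusing to bill for a future date: 2008-01-14
```

A two-line guard like this would have turned a mass overbilling into a rejected batch job and an error message.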

However, DreamHost’s handling of the situation reveals another ‘cultural’ issue that means these types of problems will recur. Their focus has not been on the customer – while some may appreciate the jokey tone of their blog post explanation, many of the commenters on their blog have condemned their ‘jokey’ if honest posting about the issue (it is perfectly OK for IQ Trainwrecks to joke about these things – we want people to laugh and then think ‘oh, that could happen to me’). As one commenter put it:

“Hey, sorry your rent bounced, but here’s a picture of Homer Simpson and some lulzy hipster prose. Joking around might not be the best technique when you are messing with people’s money”.

Finally, it is an IQ Trainwreck because DreamHost’s competitors have jumped on the opportunity to steal business from them. One competitor has created a discount code for people switching their hosting from DreamHost, giving them savings on their hosting costs (with no guarantee, I suspect, that they won’t be just as clumsy with their billing).

This will be a costly one to put right.

Hidden data, hidden dangers

I have always been an advocate of speaking of “data” rather than of “databases”, and have always felt that hiding data within large integrated database systems is a danger not only to the quality of the data but to the owners of the data themselves: the customers.

A couple of recent events illustrate this very well.

My neighbour received an e-mail confirming that his telephone and internet cancellation request had been received and that he would be cut off in the summer of 2008. My neighbour had made no cancellation request. Calls to the call centre – as to any call centre – are to people whose access to data is severely restricted. They could not see who had made this request, why it was made, or what the consequences might be. By the same token they could do little about the issue except make a note of the situation and start a process to cancel the cancellation. These operators are never allowed to pass you on to somebody who has access to more information or who can take other actions.

Bad enough, and it makes me paranoid to think that people, for whatever reason, could take actions like cancelling my ‘phone on my behalf. But today my neighbour’s telephone was cut off, 10 months before the stated date and with no acknowledgement that the request had been made in error. The company’s system has been unable to make any connection between the command to cut off the line, the command to stop this, and the demand of the customer to rectify the error.

In my own case, a certain person has requested a cable internet connection at my address, where she does not live and has never lived. This error is known to the cable company, yet they are unable to access their data in their systems to correct it properly. I have informed them, and presumably they now know where this lady really lives, because she would have complained about not getting the connection she requested (I sent the couriers with the hardware away with a flea in their ears). Yet letters continue to arrive from that company to the non-resident lady because of data quality and system integration issues which they seem powerless to correct.

The next step, I fear, is that the company will assume that the address correction is a house move and cut off my connection. Their system seems to allow two owners of a single connection, and nobody is aware that there is a data quality problem. Explaining this to the operators in call centres does nothing to resolve the basic problems.

My own ISP has made more errors in my account in the past 3 months than … but OK, you’re getting the picture.

By hiding the data within their systems these companies will never be aware that there is an issue to be resolved. As far as they are concerned the system is not throwing up error messages and there is therefore no reason to assume that the system is working incorrectly. The path between customer and data is long and protective walls have been built which prevent more than a limited amount of information about such errors reaching anybody who either cares or who can do anything about it. Losing my television for a period wouldn’t be a major worry. Losing my internet connection would be a much greater problem. By the same token we are all at risk of the inflexibilities of such systems if, for example, we get mistaken for terrorists because our names are similar to somebody else’s and access to the data to verify this is blocked.

How do we make companies aware that they have data quality and data systems problems? I wish I knew. Perhaps somebody from one of those companies (KPN, XS4ALL, UPC) will read this, will care, and will want to change things. If they do: contact me.

http://www.grcdi.nl

A timebomb or a trainwreck-in-waiting?

The BBC website carries this story today about a looming Information Quality problem. For information to be of ‘quality’ it needs to meet our expectations, and one basic expectation is that you are able to get at the information. The British National Archives warns of a “digital dark age” as a result of obsolete file formats (does anyone remember using WordStar?) and obsolete media formats (5.25-inch floppies?).

Research by the British Library suggests Europe loses 3bn euros each year in business value because of issues around digital preservation. That is the cost of information non-quality as measured by just one national library (but at least the cost has been measured).

The National Archives in the UK has already ‘lost’ information “because the programs which could read them no longer existed”. The BBC reports that the National Archives are already finding “an awful lot” of cases where information is lost and are concerned to make sure it doesn’t get any worse.

A root cause identified in the article is the range of file formats that came into being at the very beginning of the Information Age.

But this issue doesn’t just affect large national libraries or archives. What about the information that is stored in businesses (the so-called ‘unstructured data’) which may exist in ‘in-house’ file formats or in file formats for applications which your organisation no longer uses? On a personal level, what information do you have on old-format floppy disks or now-obsolete memory sticks? What family photos or important documents might you have lost?
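One practical first step in auditing an archive for at-risk files is to identify them by their content rather than trusting file extensions. A minimal sketch using magic-byte signatures – the tiny signature table is illustrative, not a complete registry:

```python
# Identify a file by its leading "magic" bytes rather than its extension.
# The signature table is a tiny illustrative sample, not a full registry.
MAGIC = {
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"%PDF-":             "PDF document",
    b"PK\x03\x04":        "ZIP-based container (e.g. .docx, .odt)",
}

def sniff(header: bytes) -> str:
    for signature, name in MAGIC.items():
        if header.startswith(signature):
            return name
    return "unknown - candidate for a format-preservation audit"

assert sniff(b"%PDF-1.4 rest of file...") == "PDF document"
assert sniff(b"\x7fsome forgotten 1980s format").startswith("unknown")
```

Anything the sniffer cannot name is exactly the material at risk of the ‘digital dark age’ the National Archives describes.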

What might the cost be to your organisation (or to you personally)?

While this may be an Information Management challenge, the impact on Information Quality is felt when information that is required, and is expected to be retrievable, cannot be located or recovered. Life is not like Star Trek, and your technicians may not be able to recover information from ‘alien’ file formats.