L'Ombre de l'Olivier

The Shadow of the Olive Tree

being the maunderings of an Englishman on the Côte d'Azur

21 April 2008 Blog Home : April 2008 : Permalink

All Your SQL Data Are Belong To Us

Thanks to the Register I came across this Daily WTF article and some of the follow up. In brief there would seem to be a lot of people, including any one unfortunate enough to have even been on Oklahoma's Sex Offender's register (and I believe many other lists of criminals and possibly even prison officers/law enforcement personnel) who may have their personal information available on the world wide web to anyone who can modify a basic SQL query. As the reg article explains:

The unfortunate developers executed several critical errors in establishing the site. First, they allowed a database with sensitive content to face the outside Internet. Second, they allowed queries from the website to access any of the information in that database. Third, they placed the SQL queries required to extract information inside GET requests from the browser (most easily recognised as the part of the URL after the question mark(?) if it is there).

Finally, and possibly most critically, they did not perform any filtering of the anonymous GET requests, happily executing the SQL requests and returning the results.

There are also a lot of other entities who, whether or not they have sensitve personal information or not, could well be letting themselves in for a nasty surprise if a bored script kiddie who understands SQL were to come along. A script kiddie who realized that SELECT is not the only SQL command there is and who therefore might decide to adapt an idea from xkcd:
Little Bobby 'Drop' Tables

How can this happen? It all boils down to people trading convenience for time/effort and not realizing that they are also trading away security for convenience. Essentially all these web pages are simply querying a database and so the developers decide that rather than create a middle layer that takes the underlying SQL query and turn it into something shorter that is suitable for a list of URL links on a page (e.g. a list of cities by ZIP code say) and then have the other middle layer that sticks the missing bits of the query back before producing the result why not just stick the whole SQL query out there and skip the code that takes the URL including (say) zip code and adds the rest of the query. Let me illustrate.

Say you have a link to (this is using the Oklahoma sex ofeenders register) a list of Oklahoma cities including Mustang (Zip 73064) and Muskogee (zip 74401 and 74403) you might decide to create a page with a list like this

Sex offenders by Zip code

If you click those links you see that all that happens is I pop up an alert tellign you the zip code. If this were a real application then clicking on the link would lead to a web server getting the information (74401 or whatever) and processing it, perhaps by converting it to an SQL query like this:

select distinct o.offender_id,doc_number,o.social_security_number,o.date_of_birth,o.first_name,o.middle_name,o.last_name,o.sir_name,sor_data.getCD(race) race,sor_data.getCD(sex) sex,l.address1 address,l.city,l.state stateid,l.zip,l.county,sor_data.getCD(l.state) state,l.country countryid,sor_data.getCD(l.country) country,decode(habitual,'Y','habitual','') habitual,decode(aggravated,'Y','aggravated','') aggravated,l.status,x.status,x.registration_date,x.end_registration_date,l.jurisdiction from registration_offender_xref x, sor_last_locn_v lastLocn, sor_offender o, sor_location l , (select distinct offender_id from sor_location where status = 'Verified' and upper(zip) = '$zipcode' ) h where lastLocn.offender_id(%2B) = o.offender_id and l.location_id(%2B) = lastLocn.location_id and x.offender_id = o.offender_id and x.status not in ('Merged') and x.REG_TYPE_ID = 1 and nvl(x.admin_validated,to_date(1,'J')) >= nvl(x.entry_date,to_date(1,'J')) and x.status = 'Active' and x.status <> 'Deleted' and h.offender_id = o.offender_id order by o.last_name,o.first_name,o.middle_name&sr=yes

Each different link above would pass a different $zipcode value and so if the user clicked on Mustang the query would be

select distinct o.offender_id,doc_number,o.social_security_number,o.date_of_birth,o.first_name,o.middle_name,o.last_name,o.sir_name,sor_data.getCD(race) race,sor_data.getCD(sex) sex,l.address1 address,l.city,l.state stateid,l.zip,l.county,sor_data.getCD(l.state) state,l.country countryid,sor_data.getCD(l.country) country,decode(habitual,'Y','habitual','') habitual,decode(aggravated,'Y','aggravated','') aggravated,l.status,x.status,x.registration_date,x.end_registration_date,l.jurisdiction from registration_offender_xref x, sor_last_locn_v lastLocn, sor_offender o, sor_location l , (select distinct offender_id from sor_location where status = 'Verified' and upper(zip) = '73064' ) h where lastLocn.offender_id(%2B) = o.offender_id and l.location_id(%2B) = lastLocn.location_id and x.offender_id = o.offender_id and x.status not in ('Merged') and x.REG_TYPE_ID = 1 and nvl(x.admin_validated,to_date(1,'J')) >= nvl(x.entry_date,to_date(1,'J')) and x.status = 'Active' and x.status <> 'Deleted' and h.offender_id = o.offender_id order by o.last_name,o.first_name,o.middle_name&sr=yes

etc.

The important thing is that when you saw the result page all you would see is some url like http://docapp8.doc.state.ok.us/pls/portal30/url/page/sor_roster?zipcode=73064
and therefore it wouldn't matter if you modified the URL by changing the 73064 to (say) 74401. Now sometimes even this is a bad idea: if it weren't zip code but were (say) Social Security Number simply responding to the query would likely include additional private information (such as name and address) so we would prefer it if someone were not able to hand craft queries. However a lot of the time this is a perfectly harmless thing. When I looked up my semi-marathon results on line yesterday I was able to also look up other people's results if I knew their race number. But it isn't necessarily good.

However in the Oklahoma case it was a lot worse than not good because instead of just the zip code you got the entire SQL query (like this)

Sex offenders by Zip code

And that means that someone who is curious can try feeding all sorts of other valid SQL query statements in and seeign what they get. When they get ot right they get a list of all the details of everyone in the database table in question. And with a little more playing get the details of every table in the database and hence all the other data in the database.

This sort of problem is, unfortunately, rather common and thanks to the internet what were once minor embarrassing internal breaches where the only people who could take advantage were bored employees, now thanks to web portals and the internet everyone else can too.