Thursday, July 16, 2015

Obergefell v. Hodges: the Database Engineering Perspective

Sam Hughes (epic 2008 post):

Altering your database schema to accommodate gay marriage can be easy or difficult depending on how smart you were when you originally set up your system to accommodate heterosexuality only. Let’s begin.

[…]

No matter how advanced and flexible your table structure, it will always be possible to create data which cannot fit into it. At that time, you will need to change your database. And the longer it’s been since you did, the less pleasant that’s going to be.

The lesson is not “prepare for every possible eventuality”. The lesson is to become comfortable and confident in modifying your schemata without losing data, and rolling back botched changes. Do this regularly, so that it becomes second nature. The lesson is to get used to change.

And what is true of our databases is also true of our world views. The future is vast and humans are creative. Things are going to happen which nobody could predict.

Sam Hughes (2015 update):

To investigate the specific ramifications of today’s ruling, however, here’s the schema we’re probably starting with:

[…]

Already the constraints on a schema like this are quite complicated. husband_id and wife_id are both foreign keys for column people.id. Check constraints ensure that the value of marriages.husband_id always points to a people row with gender set to “male” and the value of marriages.wife_id always points to a row with gender set to “female”. (Exactly how the gender column should be structured is outside the scope of this essay, but the values “male” and “female”, at least, should be available. Structuring the name column is even further out of scope, because yikes.) divorce_date is nullable. Probably there ought to be another check constraint which ensures that divorce_date doesn’t come before marriage_date.
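The schema itself is elided above, so here is only a minimal sketch of the constraints that paragraph describes, using Python's built-in sqlite3 module. The column types, the ISO 8601 text dates, the trigger name, and the helper functions `open_db` and `try_insert` are all assumptions made for illustration. SQLite does not allow a CHECK constraint to look into another table, so the gender rule is written as a trigger instead.

```python
import sqlite3

# Hypothetical reconstruction: only the column names come from the essay.
SCHEMA = """
CREATE TABLE people (
    id     INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    gender TEXT NOT NULL
);

CREATE TABLE marriages (
    id            INTEGER PRIMARY KEY,
    husband_id    INTEGER NOT NULL REFERENCES people(id),
    wife_id       INTEGER NOT NULL REFERENCES people(id),
    marriage_date TEXT NOT NULL,  -- ISO 8601 text, so string order is date order
    divorce_date  TEXT,           -- nullable: NULL means not divorced
    CHECK (divorce_date IS NULL OR divorce_date >= marriage_date)
);

-- The cross-table gender rule, expressed as a trigger (inserts only).
CREATE TRIGGER marriages_gender_check
BEFORE INSERT ON marriages
BEGIN
    SELECT RAISE(ABORT, 'husband_id must reference a male person')
    WHERE (SELECT gender FROM people WHERE id = NEW.husband_id) IS NOT 'male';
    SELECT RAISE(ABORT, 'wife_id must reference a female person')
    WHERE (SELECT gender FROM people WHERE id = NEW.wife_id) IS NOT 'female';
END;
"""


def open_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
    conn.executescript(SCHEMA)
    return conn


def try_insert(conn, husband_id, wife_id, married, divorced=None):
    """Insert a marriage row; return False if any constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO marriages (husband_id, wife_id, marriage_date, divorce_date) "
                "VALUES (?, ?, ?, ?)",
                (husband_id, wife_id, married, divorced),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

This sketch only guards inserts. A fuller version would also need triggers for updates to marriages and for changes to people.gender.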

It might be required to incorporate some sort of check for duplicate combinations of husband_id and wife_id… but then again, this could make it impossible for a couple to e.g. marry in 1994, separate in 2009 and then remarry in 2015.
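One possible way around the remarriage problem, offered as an assumption rather than anything the essay prescribes, is to make the uniqueness rule apply only to marriages that have not ended. The sketch below uses a partial unique index in SQLite (available since version 3.8.0). The index name and the helpers `open_marriage_db`, `marry`, and `divorce` are hypothetical, and the 2009 separation is treated here as a recorded divorce.

```python
import sqlite3


def open_marriage_db():
    """In-memory table with a uniqueness rule limited to current marriages."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE marriages (
            id            INTEGER PRIMARY KEY,
            husband_id    INTEGER NOT NULL,
            wife_id       INTEGER NOT NULL,
            marriage_date TEXT NOT NULL,
            divorce_date  TEXT
        );
        -- Only rows with no divorce_date take part in the uniqueness check.
        CREATE UNIQUE INDEX one_current_marriage_per_couple
            ON marriages (husband_id, wife_id)
            WHERE divorce_date IS NULL;
    """)
    return conn


def marry(conn, husband_id, wife_id, date):
    """Record a marriage; return False if the couple is already married."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO marriages (husband_id, wife_id, marriage_date) VALUES (?, ?, ?)",
                (husband_id, wife_id, date),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def divorce(conn, husband_id, wife_id, date):
    """Close the couple's current marriage, if there is one."""
    with conn:
        conn.execute(
            "UPDATE marriages SET divorce_date = ? "
            "WHERE husband_id = ? AND wife_id = ? AND divorce_date IS NULL",
            (date, husband_id, wife_id),
        )
```

With this index a couple can marry in 1994, divorce in 2009, and marry again in 2015, producing two rows, while a second simultaneous row for the same couple is still rejected.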

[…]

But the more interesting thing is that you just incidentally let in a whole bunch of edge cases. Up until now, it wasn’t possible for an individual to marry themself. Now it is, and you need a new check constraint to ensure that partner_1_id and partner_2_id are different. Regardless of concerns about duplicate rows/couples remarrying, you also now have to contend with swapped partners: Alice marries Eve, and also Eve marries Alice, resulting in two rows recording the same marriage. This can typically be prevented by ensuring that partner_2_id is greater than partner_1_id, which would incidentally also prevent self-marriage as described above. Note that this could in turn invalidate previously-existing heterosexual marriages where the husband_id was lower than the wife_id. This constraint would have to be applied for future inserts only, or the disordered rows would need to be swapped.
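The ordering rule and the "swap the disordered rows" option can be sketched together. This is a hedged illustration in Python with sqlite3, not the essay's own migration: the table layout beyond partner_1_id and partner_2_id, the temporary table name, and the functions `migrate` and `add_marriage` are assumptions. SQLite cannot add a CHECK constraint to an existing table, so the sketch rebuilds the table inside one transaction and rolls back if anything fails.

```python
import sqlite3

# Assumed pre-constraint layout; only the partner column names come from the essay.
LEGACY = """
CREATE TABLE marriages (
    id            INTEGER PRIMARY KEY,
    partner_1_id  INTEGER NOT NULL,
    partner_2_id  INTEGER NOT NULL,
    marriage_date TEXT NOT NULL
);
"""

MIGRATION = """
BEGIN;
CREATE TABLE marriages_ordered (
    id            INTEGER PRIMARY KEY,
    partner_1_id  INTEGER NOT NULL,
    partner_2_id  INTEGER NOT NULL,
    marriage_date TEXT NOT NULL,
    CHECK (partner_1_id < partner_2_id)  -- also rules out self-marriage
);
-- Copy every row, swapping the partners wherever they are out of order.
INSERT INTO marriages_ordered (id, partner_1_id, partner_2_id, marriage_date)
SELECT id,
       min(partner_1_id, partner_2_id),
       max(partner_1_id, partner_2_id),
       marriage_date
FROM marriages;
DROP TABLE marriages;
ALTER TABLE marriages_ordered RENAME TO marriages;
COMMIT;
"""


def migrate(conn):
    """Rebuild marriages with the ordering constraint, or leave it untouched."""
    try:
        conn.executescript(MIGRATION)
    except sqlite3.Error:
        if conn.in_transaction:
            conn.rollback()  # undo the partial rebuild, DDL included
        raise


def add_marriage(conn, person_a, person_b, date):
    """Store the pair in canonical order; return False if the CHECK rejects it."""
    first, second = sorted((person_a, person_b))
    try:
        with conn:
            conn.execute(
                "INSERT INTO marriages (partner_1_id, partner_2_id, marriage_date) "
                "VALUES (?, ?, ?)",
                (first, second, date),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Because both spellings of a couple now normalize to the same pair of values, any uniqueness rule on the two columns sees them as the same couple. If a legacy row already records a self-marriage, the copy step fails the CHECK and the rollback leaves the original table in place.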
