FogBugz on Demand and Hurricane Sandy
That evening, the storm surge from Sandy, assisted by a rising tide and full moon, flooded the basement of our data center, cutting off the fuel supply to the backup generator. The next morning, Peer1 informed us of an impending emergency shutdown of the generator. We executed a protective shutdown to prevent loss or corruption of customer data. Later, when we’d secured confirmation from Peer1 that there was no imminent danger of power loss, we restarted our systems. The total duration of this unplanned, voluntary downtime was 3 hours.
FogBugz is a great service, and the team worked hard to minimize the downtime. The data center is still not on city power, but the service is running smoothly. Once things are back to normal, I think we’re owed an explanation for why there was only one data center. When I signed up for the hosted FogBugz service, it was based on Joel Spolsky’s 2007 description of their infrastructure:
Rather than setting up Los Angeles as a mere backup, we decided it would be completely live. Half our customers will be hosted from Los Angeles, and half from New York. That way we know at any time that both data centers are working and set up correctly, and we don’t have to wait until a massive failure to discover the problems with the backup data center.
Copies of the database backups are maintained in both cities, and each city serves as a warm backup for the other. If the New York data center goes completely south, we’ll wait a while to make sure it’s not coming back up, and then we’ll start changing the DNS records and start bringing up our customers on the warm backup in Los Angeles.
However, the current infrastructure page describes only a single data center.
Update (2013-03-05): Mendy Berkowitz:
All excellent questions, but as we debated the various answers we realized there was nothing wrong with our current datacenter that a second datacenter wouldn’t fix. In a disaster (natural or man made) situation, a second geographically diverse datacenter with a tested and practiced failover procedure is our best option for providing our customers with continued service.
[…] FogBugz On Demand originally had a backup data center, but this was shut down sometime before Hurricane Sandy. In the aftermath of Sandy, it sounded like the goal was to again have two data centers, but this […]