I believe it would be safe to say that, without exception, all of us have experienced that moment when we lose data. That sickening, gut-wrenching, dreadful feeling… whether it is the result of an application crash, the operating system going haywire, a disk suddenly developing amnesia or a laptop literally going up in smoke. Invariably, that is when we start thinking of backups. We may never have given it a thought before, but suddenly it’s right up there with “Did I record the last episode of The Big Bang Theory?”
Now imagine for a moment that you are responsible for the backup strategy of Dropbox, Office 365 or Salesforce. What will happen if all those users lose their data? All those anguished faces, all the lawsuits, all the angry blog posts, the incensed tweets… how will you sleep at night?
At Project Portfolio Office (PPO), we do not operate on the scale of Dropbox, but we do have thousands of users who depend on us to keep their data safe. Since PPO also includes a document management system, we are responsible for backing up and securing the hundreds of thousands of documents that users entrust to us. So how do we do it?
The answer, as with any good magic trick, involves mirrors. The theory is simple: for every document that you store, ensure that at least one mirror copy is stored somewhere else.
Unfortunately, as with most things in life, the devil is in the detail. As soon as you have two copies of something there is a risk that the copies get out of sync. As soon as you transfer data, there is a risk that it may be intercepted in transit. When you want to delete a document, how do you get rid of it while at the same time protecting against accidental or malicious deletes?
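To make the out-of-sync risk a little more concrete, here is a minimal sketch of how one might check a mirror against its primary by comparing checksums. It is not how PPO does it; it assumes, purely for illustration, that both copies are ordinary local directories, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in root.rglob("*")
        if p.is_file()
    }


def compare_mirror(primary: Path, mirror: Path) -> dict[str, list[str]]:
    """Report files that are missing, unexpected or different in the mirror."""
    source, copy = snapshot(primary), snapshot(mirror)
    return {
        "missing_in_mirror": sorted(source.keys() - copy.keys()),
        "extra_in_mirror": sorted(copy.keys() - source.keys()),
        "content_differs": sorted(
            name
            for name in source.keys() & copy.keys()
            if source[name] != copy[name]
        ),
    }
```

Note that a file showing up under "missing_in_mirror" is not automatically an error: it may simply not have been copied yet, which is why a check like this is usually run on a schedule and only alerts on differences that persist.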
Since Project Portfolio Office’s inception in 2003, we have gone through various iterations and revisions of our mirroring approach, most of which we “rolled” ourselves. I am not going to go into detail about that, but rather concentrate on our current approach, which is largely based on Amazon S3.
Amazon S3 is a file storage solution (and here I thought Amazon only sold books!) that provides 99.99 percent availability and 99.999999999 percent durability. How do they provide that sort of resilience? Well, basically they do it by… you guessed it, mirroring. With mirroring, at least three copies of the data are stored in three different data centres, and continuous checking is done to make sure that if one copy fails, it is automatically rebuilt.
That’s awesome, you may say: just switch your document storage to S3 and your job is done!
Well, not quite. There are still some risks to address. What if a hacker, a disgruntled employee or a bleary-eyed admin deletes all the documents (when you delete a file on S3, it deletes all the copies)? What if we introduce a bug that deletes all documents? What if Amazon decides to disable our account because one of our customers uploaded something offensive, or because the accounts department forgot to pay the bill? What if Amazon goes belly-up (bye-bye Netflix)?
To address these risks (and they are real; if you don’t believe me, Google the Code Spaces website), the answer is once again… you guessed it, mirroring. We therefore maintain an independent mirror of the documents and also make use of some nifty AWS features like MFA, IAM roles, bucket versioning and lifecycle rules.
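For the curious, here is a rough sketch of what bucket versioning and a lifecycle rule look like as configuration. With versioning enabled, a delete only adds a delete marker and the older versions remain recoverable; a lifecycle rule can then expire those older versions after a set number of days so they do not pile up forever. The 90-day retention period and the rule ID below are assumptions for illustration, not PPO's actual settings.

```python
def versioning_config() -> dict:
    """Configuration that switches bucket versioning on."""
    return {"Status": "Enabled"}


def lifecycle_config(noncurrent_days: int = 90) -> dict:
    """Lifecycle rule that expires old (noncurrent) object versions.

    noncurrent_days is a hypothetical retention window: how long a
    replaced or deleted version stays recoverable.
    """
    if noncurrent_days < 1:
        raise ValueError("noncurrent_days must be at least 1")
    return {
        "Rules": [
            {
                "ID": "expire-old-versions",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: whole bucket
                "NoncurrentVersionExpiration": {
                    "NoncurrentDays": noncurrent_days
                },
            }
        ]
    }


# With boto3 installed and credentials configured, these dictionaries
# would be applied roughly like this (the bucket name is made up):
#
#   import boto3
#   s3 = boto3.client("s3")
#   s3.put_bucket_versioning(
#       Bucket="example-documents",
#       VersioningConfiguration=versioning_config(),
#   )
#   s3.put_bucket_lifecycle_configuration(
#       Bucket="example-documents",
#       LifecycleConfiguration=lifecycle_config(),
#   )
```

The retention window is the trade-off to think about: the longer old versions are kept, the more time you have to notice an accidental or malicious delete, but the more storage you pay for.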
If you are interested in a more technical description of how exactly we do this, please get in touch!