As Super Bowl LIV approaches and our local team heads to the final game, the day-to-day chatter about winning is ramping up. It made me wonder what lessons carry over to our own work activities. Here are my top 3:
1) Practice, practice, practice
Plans are OK, but without practice and the development of muscle memory they are just plans.
Ask any world-class team how well their “locker room” whiteboard plans worked out on the field and you will find that, without regular practice and detailed checking, a game plan might end up being nothing more than a Hail Mary. The easier it is to design the plays, run them, check them against the rules and refine them, the more likely you are to succeed when the clock is ticking and timing and execution determine success.
How do you practice disaster recovery – and return to normal service?
2) Protect the primary assets
3-2-1 is not a countdown – it’s a decision on how much protection you need to keep the game going. What happens if you lose your primary quarterback?
In the data protection business it’s prudent to have 3 copies of your data, one of which is the current primary copy that you conduct business with. Ideally these 3 copies sit on 2 different types of systems or media to mitigate any platform failure, and at least 1 copy is at another location to provide some geographic relief if needed. For many of today’s modern data centers, those 2 locations are your data center and your cloud.
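To make the 3-2-1 idea concrete, here is a minimal Python sketch that checks a list of copies against the three conditions. Everything in it is my own illustration and not any product's API: the `DataCopy` class, the `check_3_2_1` function, and the media and location labels are all assumed names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    """One copy of the data (hypothetical model for illustration)."""
    name: str
    media: str        # e.g. "disk", "tape", "object-storage"
    location: str     # e.g. "datacenter", "cloud"
    is_primary: bool = False


def check_3_2_1(copies):
    """Return which parts of the 3-2-1 guideline a set of copies satisfies."""
    primaries = [c for c in copies if c.is_primary]
    if len(primaries) != 1:
        raise ValueError("expected exactly one primary copy")
    primary_location = primaries[0].location

    media_types = {c.media for c in copies}
    # "Offsite" here means any location other than where the primary lives.
    offsite = [c for c in copies if c.location != primary_location]
    return {
        "three_copies": len(copies) >= 3,
        "two_media_types": len(media_types) >= 2,
        "one_offsite_copy": len(offsite) >= 1,
    }


# Example inventory (made-up names): primary, a local snapshot, a cloud copy.
copies = [
    DataCopy("production", media="disk", location="datacenter", is_primary=True),
    DataCopy("local-snapshot", media="disk", location="datacenter"),
    DataCopy("cloud-backup", media="object-storage", location="cloud"),
]
result = check_3_2_1(copies)
print(result)
print("3-2-1 satisfied:", all(result.values()))
```

Take away the cloud copy and all three checks fail at once, which is a fair picture of what losing your only backup quarterback looks like.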
I’m not going to get into the underlying implementation details of disk vs tape, or of snapshots vs mechanically extracted backups that require extensive data movement. From a functional point of view there are many options that are OK. Just know that they differ in simplicity, cost and ownership, as well as in how they serve the business drivers discussed below.
Once you figure out the copies, add in your SLAs: the recovery point objective (RPO) and the recovery time objective (RTO). Clearly, the less data you lose (RPO) and the sooner you get back into business (RTO), the better. The less time you have to wait to secure the 2nd and 3rd copies somewhere else, and the quicker you can put the version you want back into live service, the more likely you are to successfully fight whatever disaster strikes. Keep an eye on the clock at all times. Instant RTO, anyone?
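A quick way to keep an eye on that clock is to measure a drill against your targets. The sketch below is only an illustration under assumed numbers: the 15-minute RPO, the 1-hour RTO, the timestamps, and the helper names `data_loss_window` and `downtime` are all made up for the example.

```python
from datetime import datetime, timedelta


def data_loss_window(last_good_copy, failure):
    """Time between the last recoverable copy and the failure (compare to RPO)."""
    return failure - last_good_copy


def downtime(failure, service_restored):
    """Time between the failure and return to service (compare to RTO)."""
    return service_restored - failure


# Assumed SLA targets for the example.
rpo_target = timedelta(minutes=15)
rto_target = timedelta(hours=1)

# Made-up timeline from a hypothetical recovery drill.
last_good_copy = datetime(2020, 1, 30, 9, 50)
failure = datetime(2020, 1, 30, 10, 0)
service_restored = datetime(2020, 1, 30, 10, 45)

loss = data_loss_window(last_good_copy, failure)
outage = downtime(failure, service_restored)
rpo_met = loss <= rpo_target
rto_met = outage <= rto_target

print(f"Data loss window: {loss} (target {rpo_target}) met: {rpo_met}")
print(f"Downtime: {outage} (target {rto_target}) met: {rto_met}")
```

Run it with the numbers from your own practice sessions and you will see quickly whether the game plan holds up when the clock is ticking.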
How and where is your data protected?
3) Play at scale
One size does NOT fit all, and scale matters. The focused actions of one or two individuals are sometimes needed, but the whole team must execute together.
Many businesses are more than small sample scenarios. They are complex, multi-component configurations, sometimes with hundreds or thousands of moving parts. Design your battle plan for disaster recovery to operate at the proper scale to meet your business needs.
This may not require you to address every single part of the environment, but my guess is the 80/20 rule may apply. Find the 20% of your infrastructure that accounts for 80% of the “fight” and get that nailed down. It will make dealing with the other 80% easier and increase your chances of success.
How complex are your critical operations?
Let me know what you think.