I don't think I'm alone as a sysadmin in feeling guilt about saying to a client "Sorry, there's nothing that I can do." I feel like it's my fault...sigh.
Hardware is replaceable; data is not. One cannot make up data; it was created for a purpose, you pay your employees to create it, and your customers are expecting you to deliver it. If you need to make a shipment tomorrow and the important details were in an Excel file that was lost to a failed hard disk, all the money in the world won't bring it back, and more importantly, even if it could, would it be recreated in time for your deadlines?
Please, backup, and test that you can get
The Dinosaur that Won't Die - Tape
Sysadmins like myself really dislike tape backup, yet it has been the 'go-to' technology for the last, say, 40 years. Linear tape really is an old-fashioned medium. It is reliable, but difficult to manage. Since the data is stored in a long stream (linearly) along the tape, individual file restores are very time-consuming. Because tape is so cumbersome to write to, you need a big backup software package to control what goes onto each tape and to manage the indexes of which files are located where. The indexes themselves are not usually stored on tape, so they have to be recreated if you lose your server, which means spending hours manually feeding tapes and letting your backup software re-index all of the files on each tape.
Then of course there's the human-factor issue with tape: you have to train people to rotate tapes, keep to the schedule, report errors and possibly interact with the server to see status messages (dangerous!). These users may also become "forgetful" and not take the tapes off-site, or forget to bring them back on-site to actually update the tape data. These elaborate retention and rotation schemes are just asking for trouble, and the incremental nature of most tape backups means that you need all of your tapes to get all of your data back, because the most recent tapes contain only recently changed files. Yikes!
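To make the incremental problem concrete, here is a minimal sketch in Python. The tape labels, file names and the `Tape` class are all made up for illustration; real backup software keeps far richer catalogs than this. It simply shows what goes missing when the full-backup tape is not on hand.

```python
from dataclasses import dataclass, field


@dataclass
class Tape:
    """One tape in a hypothetical rotation: a label plus the files written to it."""
    label: str
    files: dict = field(default_factory=dict)  # path -> content version


def restore(tapes_in_order):
    """Replay tapes oldest-first; later tapes overwrite earlier versions."""
    restored = {}
    for tape in tapes_in_order:
        restored.update(tape.files)
    return restored


# Hypothetical week: one full backup, then incrementals holding changed files only.
full = Tape("Mon-full", {"orders.xlsx": "v1", "customers.db": "v1", "logo.png": "v1"})
tue = Tape("Tue-incr", {"orders.xlsx": "v2"})
wed = Tape("Wed-incr", {"customers.db": "v2"})

complete = restore([full, tue, wed])
recent_only = restore([tue, wed])  # the full tape was lost or left off-site

missing = sorted(set(complete) - set(recent_only))
print("Missing without the full tape:", missing)
```

Any file that did not change after the full backup exists on exactly one tape, so losing that tape loses the file.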
Another thing to consider is that your disaster recovery plan probably also includes contingencies for if your building burns down. Unfortunately, tapes are no good without a compatible tape drive. They require special hardware to be read. Tape drives run from $2000-$4000 new, and aren't something you can pick up from your local computer fix-it store. This will delay your server/data restoration, unless of course you purchase an extra tape drive that waits and depreciates in an off-site closet for an event that hopefully will never occur.
So you have a lot of things working against you with tape:
- Specialized hardware
- Recreation of indexes
- Inability/difficulty in restoring
- Incremental backups
- Laborious and error-prone physical management of media
While tape is good enough for Google, I believe they have the money for nice automated tape libraries (think "tape jukebox") and dedicated personnel to manage their tapes. Most SMBs don't.
Here I'm going to talk about disk-to-disk backup, because for most of my clients, their Internet connections are not beefy enough to do real online backup. Furthermore, most online backup houses will not ship you a hard disk when you need to restore everything; you have to download it all. For the businesses I work for, which have an average of 1TB of data on shares and in Outlook PSTs, that's just not reasonable.
So I recommend disk-to-disk backup to all of my clients.
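To put the "download it all" problem in rough numbers, here is a back-of-the-envelope sketch in Python. The link speeds and the 80% efficiency factor are my own illustrative assumptions, not measurements from any client site.

```python
def restore_days(data_bytes: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Days needed to download data_bytes over a link of link_mbps megabits/s.

    efficiency is an assumed fraction of the nominal rate actually achieved.
    """
    bits = data_bytes * 8
    usable_bits_per_second = link_mbps * 1_000_000 * efficiency
    return bits / usable_bits_per_second / 86_400  # 86,400 seconds per day


ONE_TB = 10**12  # decimal terabyte, as drive vendors count

for mbps in (10, 50, 100):  # illustrative link speeds only
    print(f"{mbps:>4} Mbit/s: {restore_days(ONE_TB, mbps):.1f} days")
```

Under these assumptions a full 1TB restore over a 10 Mbit/s line takes well over a week, and that is with the connection doing nothing else.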
More on Disk Backup
Disk-to-disk backup is as simple as it seems. You simply use another hard disk (or disks) to back up the hard disks in your computers and servers. We are up to 4TB on 3.5" hard disks these days, so data density is very good for a competitive price. With modern block-level backup (instead of file-based backup, as with tape), only the changes to data are stored, and we can do these differential block-based backups granularly on the disk because hard disks, unlike tape, are made for random access.
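The block-level idea can be sketched in a few lines of Python. This is only an illustration: real backup products track changed blocks in their own ways, and the hashing approach, the 4096-byte block size and the function names here are my assumptions, not how any particular tool works.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for illustration


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Hash each fixed-size block of a volume image (here just a bytes object)."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Return {block_index: block_bytes} for blocks that differ or are new."""
    old_hashes = block_hashes(old, block_size)
    delta = {}
    for index, digest in enumerate(block_hashes(new, block_size)):
        if index >= len(old_hashes) or old_hashes[index] != digest:
            start = index * block_size
            delta[index] = new[start:start + block_size]
    return delta


# A 10-block "volume" where a single byte in block 3 changes between backups.
yesterday = bytes(10 * BLOCK_SIZE)
today = bytearray(yesterday)
today[3 * BLOCK_SIZE + 17] = 0xFF
delta = changed_blocks(yesterday, bytes(today))

print(f"Blocks to store: {sorted(delta)} "
      f"({sum(len(b) for b in delta.values())} of {len(today)} bytes)")
```

A file-based backup would copy the whole changed file again; here only the one block that changed needs to be written, which is cheap on a random-access disk. The sketch ignores volumes that shrink between backups.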
In my own business, I first tried USB external drives with clients. USB drives always had to be replugged, would spontaneously change drive letters, would have their internal boards fail (Western Digital, I'm looking at you), and people had a nasty habit of knocking them over while they were running, which would just toast them. On top of that, the USB connectors would be destroyed by constant replugging, as they are really not designed for a high number of replug cycles.
Then I looked at purpose-made devices like RDX drives. These seem like an intelligent solution until you notice that you need custom software and have to buy media from the manufacturer at inflated prices, at storage capacities that are 12 months behind the storage curve. Plus, RDX drives were 2.5", meaning they have reduced capacity to begin with, and the portability argument for 2.5" drives over 3.5" wasn't convincing to me.
How could we combine the reliability and robustness of tape with the random access performance and low price of commodity hard disks?
Highly Reliable Systems to the Rescue
After scouring the Internet for a few hours I came upon a company based in Reno, Nevada, making a wide range of devices including a nice appliance that houses 2 x 3.5" standard SATA drives in a nice hot-pluggable configuration. For extra usability they include LCD displays and LEDs right on the device to tell the end user the status of drives and replication.
They call it the High-Rely 2-bay AMT. I call it common sense.
The way it works is that one drive always remains in the AMT. To the OS, the AMT looks like any other eSATA drive. The swapping of drives is not visible to the OS, meaning no problems with backups being missed because drive letters change or because something wonky happened on the USB bus.
The trays are nice and beefy, and on High-Rely's site you can see them chuck a tray with a disk inside off of a roof and then demonstrate that the hard disk still functions perfectly.
And what does the user do to manage the High-Rely? When they come into work and both drives are in the green state, they just unlock either drive and remove it, insert another one, and watch the lights furiously blink until replication is complete. There is always one drive off-site, just like we IT people like.
All of the RAID1 replication is done inside the High-Rely, so there is no load to the OS, and no management of RAID or the replication process. However, it should be noted that doing a backup to the High-Rely while it is replicating between drives will increase the time until both drives are ready, and will slow the backup job as well.
The sleds are aluminum, with an LCD in front and a hot-swap connector on the back. They don't use the drive's actual SATA connector, to reduce the likelihood of damaging it through continued replugging. There are four screws holding the disk in the caddy, so you can remove and replace the drive with a higher-capacity one down the line. I asked High-Rely support about this and, surprisingly, they didn't threaten me with claims of invalidating the warranty; they actually laughed and said that's the point! When you need to recover, you can simply take the SATA drive out, connect it to a computer and get your data. What a concept!
Oh, and with Windows Server Backup (wbadmin.exe) you get unlimited retention (until the disk is full), so each disk contains weeks of revisions of every file on your server. One client of mine has 800GB of data, and with daily "full" block-based backups gets 12 days' worth of complete snapshots on each 2TB sled. Very cool.
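If you want to guess how many snapshots your own data would get, a rough sketch like the following helps. It assumes the first snapshot costs the full data size and each later one costs only that day's changed blocks, with no compression or metadata overhead; that is a simplification I am making for the arithmetic, not a statement of how Windows Server Backup actually lays out its storage. The function names are mine.

```python
def snapshots_that_fit(capacity_gb: float, full_gb: float, daily_change_gb: float) -> int:
    """Number of daily snapshots that fit: one full copy plus per-day deltas."""
    if full_gb > capacity_gb:
        return 0
    if daily_change_gb <= 0:
        raise ValueError("daily_change_gb must be positive")
    return 1 + int((capacity_gb - full_gb) // daily_change_gb)


def implied_daily_change_gb(capacity_gb: float, full_gb: float, snapshots: int) -> float:
    """Average delta per day implied by an observed snapshot count."""
    return (capacity_gb - full_gb) / (snapshots - 1)


# The client example above: 800GB of data, 12 snapshots on a 2TB (2000GB) sled.
change = implied_daily_change_gb(2000, 800, 12)
print(f"Implied change rate: about {change:.0f} GB/day")

# A hypothetical quieter server changing 50GB/day on the same sled.
print("Snapshots at 50 GB/day:", snapshots_that_fit(2000, 800, 50))
```

Under those assumptions the client above is churning through something like 109GB of changed blocks a day, and a server with half that churn would keep roughly twice the history on the same sled.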
Some Issues
The one thing I can complain about is that the optional host software is silly-looking. I use it because it offers features like email notification and visual status separate from the front panel.
I called their HQ, spoke to the owner, and asked him about integrating other software with this communication channel over SATA. He said the Chinese company that makes the RAID solution used in the High-Rely will not disclose how they send info over SATA, so we're stuck with this.
It seems that after a power failure, the High-Rely goes into a state where it doesn't know how to replicate anymore. There is a process I have to go through every few months with a client to get it back to normal. It is a simple fix and no data is lost; I don't have to reconfigure or reset backups. With a proper UPS and a stable power grid this would normally never happen, I think.
Summary
So, in conclusion, I always recommend this solution to clients if they have lots of data (over 500GB) and they are willing to shell out the approx. $900 for the appliance and 3 sleds with drives.
High-Rely also offers some more sophisticated NAS-based devices with large swappable cages that hold 3 drives each. That's 3x4TB, or 12TB, for those in the graphics or video industry.