Tuesday, October 19, 2010

rsnapshot - it's for pulling, obviously!

It's been a few months since I learned about rsnapshot, a neat little utility that uses hard links to create multiple point-in-time snapshots of your data. It builds on rsync, wrapping its best features into a package that does more than just mirror data.

If you've ever had clients who suffer from "data decay", you'd know that just having the most recent version of a file can be useless. An Excel spreadsheet could have been corrupted weeks ago, and people will continue to write changes to it--despite warning messages--until it really fails hard, which is always too late.

Conceptually, I was having a hard time understanding the topology of the rsnapshot server and the clients it's supposed to back up. Are you supposed to have rsnapshot on every client machine, connecting to the server using only rsync? Actually, no.

Quite foolishly I was modifying the snapshot_root directive in /etc/rsnapshot.conf to point to a network location (CIFS share), even though the comments clearly state that it's supposed to be a local root. I guess this should have been a no-brainer, but in my skimming of the documentation it wasn't clear why I couldn't set the snapshot_root to be a network location!

Only when I tried to use rsnapshot in conjunction with a TeraStation Live did I learn the truth! Behold: rsnapshot sits on the server, manages its own backup root, and communicates with clients using only rsync, or rsync over SSH. This makes sense because the rsnapshot server must scan its local repositories for changes, and it handles the snapshot folder rotation itself. For example:


Say you retain 5 daily snapshots. Each new daily backup then triggers a rotation: the oldest snapshot, daily.4/, is deleted; daily.3/ becomes daily.4/, and so on down to daily.0/ becoming daily.1/; finally a fresh daily.0/ is created, with unchanged files hard-linked to the previous snapshot. On a local filesystem, these are cheap renames and link operations that only touch directory entries and inode reference counts, but on a remote FS, all bets are off.
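To make the rotation concrete, here is a minimal Python sketch of the idea. It is not rsnapshot's actual code: the function names are made up for illustration, and it assumes rsnapshot's usual numbering, where daily.0/ is the newest snapshot and the highest number is the oldest. The real tool follows the hard-link step with an rsync run that replaces only the files that changed.

```python
import os
import shutil


def rotate(root, retain):
    """Shift daily.N -> daily.N+1 and drop the snapshot that falls off the end."""
    oldest = os.path.join(root, f"daily.{retain - 1}")
    if os.path.isdir(oldest):
        shutil.rmtree(oldest)
    # Walk from the second-oldest down to the newest so no rename collides.
    for n in range(retain - 2, -1, -1):
        src = os.path.join(root, f"daily.{n}")
        if os.path.isdir(src):
            os.rename(src, os.path.join(root, f"daily.{n + 1}"))


def take_snapshot(root, retain):
    """Rotate, then create a new daily.0 whose files are hard links to daily.1."""
    rotate(root, retain)
    previous = os.path.join(root, "daily.1")
    current = os.path.join(root, "daily.0")
    if os.path.isdir(previous):
        # copy_function=os.link creates hard links instead of copying data,
        # so unchanged files cost no extra disk space.
        shutil.copytree(previous, current, copy_function=os.link)
    else:
        os.makedirs(current)
    return current
```

Every step here is a rename, an unlink, or a hard link. Those are cheap and reliable on a local filesystem, but hard links in particular are not something you can count on over a CIFS share.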

On a related note, I'll have a quick guide on how to get root access to a TeraStation Live, and how to install rsnapshot.

A second post will go over how to initiate a VSS snapshot of a Windows drive over SSH in preparation for a pull backup from a Windows client to a Linux server.

Monday, October 4, 2010

Mac OS X filesystems - Conspicuously lacking

I guess I'm a bit spoiled. Linux never leaves me hanging when I need to access a filesystem. It handles anything I throw at it: NTFS, FAT32, HFS+, ext2, ext3, XFS, ZFS.

The other day I was trying to help my sister set up Ubuntu to dual boot with Mac OS X Leopard (10.5). Apple's Disk Utility couldn't seem to resize the main HFS+ partition, claiming that there wasn't enough free space (even though 12GB was available). I figured a system file in use was in the way, so it was off to burn a 10.5 bootable DVD--because she's like everyone else and lost her original one. Who ever saves these things? I'm still dreaming of the day a client answers, "Yes!" to the question, "Do you have the original install CDs?"

I had an ext3-formatted disk with the OS X .iso file on it, and needed to use a Windows system that had a DVD-DL drive (the Mac OS X disc is 7.5GB). Even with the software from DiskInternals, I couldn't get Windows to read the partition (the inode size was 256 bytes instead of the 128 that Linux-Reader expected). I had another Mac OS X install image on the Mac itself, but we found out that Mac OS X can write FAT32 but not NTFS, and with FAT32's 4GB file size limit, it's impossible to copy the 7.5GB ISO to a flash drive. So the only filesystem you can use to copy 4GB+ files from a Mac to other machines is HFS+!
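The FAT32 problem is simple arithmetic: the filesystem stores each file's size in a 32-bit field, so a single file tops out at 2^32 - 1 bytes (one byte short of 4 GiB). Here is a tiny Python check; the function name is just for illustration, and the 7.5GB figure is taken as 7.5 billion bytes.

```python
FAT32_MAX_FILE_BYTES = 2**32 - 1  # largest size a 32-bit size field can hold


def fits_on_fat32(size_bytes):
    """Return True if a single file of this size can be stored on FAT32."""
    return size_bytes <= FAT32_MAX_FILE_BYTES


iso_size = 7_500_000_000  # the 7.5GB install image, approximated in bytes
print(fits_on_fat32(iso_size))
```

Splitting the image into pieces under the limit would work around it, but then you need a tool on the other side to put it back together.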

Filesystems that support 4GB+ file sizes:
  • Outgoing from Mac
    • HFS+
  • Outgoing from Windows
    • NTFS
  • Outgoing from Linux
    • NTFS, ext2, ext3, xfs, HFS+ (journal disabled)

Look how flexible Linux is! I guess I thought that with its UNIX heritage, Mac OS X would have these extra filesystem drivers included "no charge", "for good will". Perhaps it's denial, as Apple has got its own little ecosystem you aren't supposed to stray from...

FYI, for $31 USD, you can get NTFS for Mac, based on the GPL ntfs-3g software widely used in Linux. It may just be worth it. Personally, I'd just ditch Mac OS.

Windows Server Backup - MS gives you less

I have a dream. One where I can back up a server with minimal fuss, with desirable and delicious features like backing up locked files, storing multiple points in time, only backing up changed sectors, and having a command line tool to control jobs.

Believe it or not, there are tons of vendors out there selling backup software that recopies entire files when only a few bytes have changed. So with VM images, this could mean tens of GBs of unnecessary data copies every time you do a backup. Kind of a bummer.
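To show what "only backing up what changed" means, here is a rough Python sketch that compares two versions of a file block by block and reports which blocks differ. The 4096-byte block size and the function names are assumptions for illustration. This is plain fixed-offset comparison, not rsync's rolling-checksum algorithm, so it copes with in-place modifications (typical for VM disk images) but not with data inserted in the middle of a file.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def block_digests(data, block_size=BLOCK_SIZE):
    """Hash each fixed-size block of a bytes object."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Return indices of blocks in `new` that differ from, or extend past, `old`."""
    old_d = block_digests(old, block_size)
    new_d = block_digests(new, block_size)
    return [i for i, d in enumerate(new_d)
            if i >= len(old_d) or old_d[i] != d]
```

Flip a single byte in a 1 MiB buffer and changed_blocks() reports one block, so a block-aware tool would copy 4 KiB where a whole-file tool copies the full megabyte. Scale that up to a VM image and you get the tens of GBs mentioned above.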

Anyone who's tried to run ntbackup.exe in Server 2008 has discovered the buried, curious, new tool called Windows Server Backup. Maybe this is what we were looking for all along? Maybe not.

Great stuff that MS removed since ntbackup.exe:

  • Backup to tape -- no longer supported; you need a 3rd party backup program to do this
  • Backup to network (CIFS) share -- no longer offered, see note below

A workaround for the network share feature is to use the wbadmin.exe tool to manually execute backups to network destinations, with the huge limitation that each run completely overwrites the previous backup that was stored there. So for 1TB of data, you are copying 1TB each day!

Other "features"

  • If you select a complete system backup (suitable for bare-metal restore), then all drives on my server are selected because Windows thinks that there are system files on my E:\ drive. But Server Backup won't tell you what/where they are!

Luckily, there are a few bits of candy that Microsoft is going to tease you with:

  • VHD format backups
  • VSS support for getting data in a consistent state (Hyper-V and SQL Server)
  • It's free with Windows Server

I'm a nerd, and I get nerd-fanciful about the fact that VHD is the new backup format. I could conceivably use qemu-img to convert the VHD files to a raw disk image suitable to dd to a new server from a Linux live CD. Seeing as it's used in Hyper-V as well, I can see a lot of 3rd party developers offering tools to recover data from VHDs in the event of corruption.
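For the curious, the conversion I have in mind would look something like this Python wrapper around qemu-img. qemu-img calls the VHD format "vpc" (after Virtual PC); the function names and file names are placeholders, and I haven't run this against an actual Windows Server Backup VHD, so treat it as a sketch.

```python
import subprocess


def qemu_img_convert_args(vhd_path, raw_path):
    """Build the qemu-img command line for a VHD -> raw conversion."""
    # "vpc" is qemu-img's name for the VHD format; "raw" is a plain disk image.
    return ["qemu-img", "convert", "-f", "vpc", "-O", "raw", vhd_path, raw_path]


def convert_vhd_to_raw(vhd_path, raw_path):
    """Run the conversion; raises CalledProcessError if qemu-img fails."""
    subprocess.run(qemu_img_convert_args(vhd_path, raw_path), check=True)
```

Calling convert_vhd_to_raw("backup.vhd", "backup.img") should leave a raw image that can be written to the new server's disk with dd.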

Having said all of this, the About dialog shows Windows Server Backup at v1.0, so maybe all this will be fixed the next time around. How about you? Have you found the perfect backup tool that has the features mentioned at the top of the article?