Monday, February 25, 2008

Another (Probably Better) Approach to Backup

My backup system is failing. The gap between backups is stretching to a month
or more, and the whole routine is very tiresome. I'm seriously thinking of
switching to this hardcore strategy, which seems easy and effective but costs
money to get going. I'm pasting the text here so I can always find it and it
won't disappear, but it's from here:
(lots of useful, mostly Mac-specific comments below)


Dear Lazyweb, and also a certain you-know-who-you-are who should certainly
know better by now,

I am here to tell you about backups. It's very simple.

Option 1: Learn not to care about your data. Don't save any old email, use
a film camera, and only listen to physical CDs and not MP3s. If you have
no possessions, you have nothing to lose.

Option 2 goes like this:

You have a computer. It came with a hard drive in it. Go buy two more
drives of the same size or larger. If the drive in your computer is SATA2,
get SATA2. If it's a 2.5" laptop drive, get two of those. Brand doesn't
matter, but physical measurements and connectors should match.
Get external enclosures for both of them. The enclosures are under $30.
Put one of these drives in its enclosure on your desk. Name it something
clever like "Backup". If you are using a Mac, the command you use to back
up is this:
sudo rsync -vaxE --delete --ignore-errors / /Volumes/Backup/

If you're using Linux, it's something a lot like that. If you're using
Windows, go fuck yourself.
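
For the Linux case, here is a rough sketch of what "something a lot like that" might look like. The mount point /mnt/backup and the BACKUP_DEST variable are assumptions, not part of the original post. The flags are standard rsync options (-A for ACLs, -X for extended attributes, -H for hard links) swapped in for the Mac-oriented -E. The function only prints the command, so you can review it before running it yourself with sudo.

```shell
#!/bin/sh
# Sketch of a Linux counterpart to the Mac command above.
# BACKUP_DEST is a hypothetical mount point; change it to wherever
# your backup drive is actually mounted.
BACKUP_DEST="${BACKUP_DEST:-/mnt/backup}"

# Print the rsync command rather than running it.
#   -a  archive mode (permissions, times, symlinks, devices, ...)
#   -A  preserve ACLs
#   -X  preserve extended attributes
#   -H  preserve hard links
#   -x  do not cross filesystem boundaries
#   -v  verbose
backup_command() {
    printf 'rsync -vaAXHx --delete --ignore-errors / %s/\n' "$BACKUP_DEST"
}

backup_command
```

Because of -x, anything mounted as a separate filesystem (a separate /home partition, for example) is skipped and would need its own rsync run.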

If you have a desktop computer, have this happen every morning at 5AM by
creating a temporary text file containing this line:
0 5 * * * rsync -vaxE --delete --ignore-errors / /Volumes/Backup/

and then doing sudo crontab -u root that-file
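
One thing to watch: installing a crontab from a file replaces whatever root's crontab already contained. A minimal sketch of a more cautious variant follows, assuming you would rather append the job to the existing entries. The function name add_backup_job is made up for illustration.

```shell
#!/bin/sh
# The schedule line from the text: every day at 05:00.
CRON_LINE='0 5 * * * rsync -vaxE --delete --ignore-errors / /Volumes/Backup/'

# Read an existing crontab on stdin and print it back with CRON_LINE
# appended, unless that exact line is already present.
add_backup_job() {
    existing=$(cat)
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    if ! printf '%s\n' "$existing" | grep -Fxq -- "$CRON_LINE"; then
        printf '%s\n' "$CRON_LINE"
    fi
}

# Usage (not run here):
#   sudo crontab -u root -l 2>/dev/null | add_backup_job > that-file
#   sudo crontab -u root that-file
```

Running it twice is harmless, since the line is only added when it is missing.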

If you have a laptop, do that before you go to bed. Really. Every night
when you plug your laptop in to charge.

If you're on a Mac, that backup drive will be bootable. That means that
when (WHEN) your internal drive scorches itself, you can just take your
backup drive and put it in your computer and go. This is nice.
When (WHEN) your backup drive goes bad, which you will notice because your
last backup failed, replace it immediately. This is your number one
priority. Don't wait until the weekend when you have time, do it now,
before you so much as touch your computer again. Do it before goddamned
breakfast. The universe tends toward maximum irony. Don't push it.
That third drive? Do a backup onto it the same way, then take that to your
office and lock it in a desk. Every few months, bring it home, do a
backup, and immediately take it away again. This is your "my house burned
down" backup.
"OMG, three drives is so expensive! That sounds like a hassle!" Shut up. I
know things. You will listen to me. Do it anyway.
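
Noticing that "your last backup failed" only works if something records the result. The sketch below is one hedged way to do that: a small wrapper that runs whatever backup command you hand it and writes the outcome to a status file. The names run_and_record and STATUS_FILE, and the file location, are assumptions for illustration.

```shell
#!/bin/sh
# STATUS_FILE is a hypothetical location for the most recent result.
STATUS_FILE="${STATUS_FILE:-$HOME/.backup-status}"

# Run the command given as arguments, then record whether it succeeded.
# Returns the command's own exit status on failure.
run_and_record() {
    if "$@"; then
        echo "OK $(date '+%Y-%m-%d %H:%M')" > "$STATUS_FILE"
    else
        status=$?
        echo "FAILED exit=$status $(date '+%Y-%m-%d %H:%M')" > "$STATUS_FILE"
        return "$status"
    fi
}

# Usage (not run here):
#   run_and_record rsync -vaxE --delete --ignore-errors / /Volumes/Backup/
#   cat "$STATUS_FILE"
```

When copying a live system, rsync can exit with code 23 or 24 (partial transfer) even though the drive is healthy, so a FAILED line is a prompt to look at the output, not proof that the drive is dead.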

Update: Mac users: for the backup drive to be bootable, you need to do two
things:

When you partition the drive, use GUID, not Apple Partition Map;
Get Info on the drive and un-check "Ignore ownership on this drive" under
"Ownership and permissions."
You can test whether it's bootable by holding down Option while booting
and selecting the external drive.


UPDATE: I have now implemented this system. Well, not all of it: I don't have a backup backup hard drive. But I bought a spacious hard drive that is compatible with my computer's internal HD (swapping it in won't be a piece of cake, but it's possible), and an enclosure with a fast connection (USB 2.0/FireWire). It wasn't too expensive. I use a program for the Mac called Carbon Copy Cloner to automatically back up all changes every week at noon. This hard drive is also bootable; I tried it out. A nicer thing to have would be Apple's Time Machine, which also keeps track of many old versions of your files, but for my geriatric computer and price range Carbon Copy Cloner works well, if slowly. There are no doubt thousands of similar applications for the PC (and if you're using Linux you're probably comfortable enough with using rsync on the command line). So I'm feeling a lot more secure about my data now. The critical points are that it's *automatic*, it backs up *everything* (I don't have to manually add files to the backup), and it's *bootable* (so I'm not stuck when my hard drive dies).


Matthew Skala said...

I did tell you in December that one of my own priorities in an archive system (not the same as backup, but closely related) was for it to work without requiring scheduled human intervention.

Daniel said...

Yes for sure, I now see that that's practically essential. What's the difference between an archive and a backup system as you see it? I'm intrigued.

Matthew Skala said...

Archive: I want to keep this for a long time, possibly forever. I want to preserve *this* version of it in its current state, even if I will later be using or preserving other versions separately. I'll refer to it from time to time. I'm protecting against the passage of time.

Backup: I want to make sure I don't lose the current version of this, as the current version gets updated. I'll keep a copy for a short time, probably to be replaced as soon as there's a new version (or maybe after N iterations). I'm only interested in the current version. I hope not to refer to it at all, but if I do, I'll expect to only refer to it once, when I replace the lost original. I'm protecting against short-term events that could cause me to lose the original.

Retention time is different; there's more need for fast updating of backups; there's more need for cataloguing and indexing an archive because it'll continue growing indefinitely.

Daniel said...

That is an excellent distinction, and adds some important points to my concept of shallow and deep backup (deep being archive and shallow being what you call backup). Is there any special technological solution you use for keeping multiple versions of the same file, or just careful saving, naming, etc.?