---------- Forwarded message ----------
Date: Mon, 6 Feb 2006 17:44:11 -0500 (EST)
From: Daniel Saunders <xxxx@qlink.queensu.ca>
I've been thinking about the personal backup system, and I wanted to put 
forward this idea, of a simple two-level backup:
  * Shallow backup. Purpose is to allow you to continue to work on your current 
projects, and not lose recent work (including recently finished projects) and 
correspondence etc. The window is about 2 years. This could be taken care of by 
an automatic snapshot system such as Jim has the benefit of. This would ideally 
be updated at least weekly.
  * Deep backup. Conceptualized as a single backup system spanning your *entire 
life as a computer user*. Across every computer system you have ever worked on 
(which is now technically possible), work, school, home, etc. We would expect 
to save in here things such as
    - completed projects (including data that went into old papers)
    - uncompleted projects, indefinitely suspended
    - personal records
    - long term personal reference material
    - personal history (souvenirs, emails etc)
So this window could conceptually be 40 years wide. It could be updated on a 
monthly, or term basis. It would also include everything that's in the shallow 
backups.
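
As a rough illustration only, the two levels described above could be written down as a small configuration. The class name `BackupTier`, its fields, and the choice of 30 days to stand in for "monthly" are assumptions made for this sketch, not part of the original proposal.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class BackupTier:
    """One level of the two-level scheme (hypothetical structure)."""
    name: str
    window_years: int   # how far back in time the tier is meant to reach
    interval_days: int  # how often the tier should be refreshed

    def is_due(self, last_run: datetime, now: datetime) -> bool:
        """True when at least one full interval has passed since last_run."""
        return now - last_run >= timedelta(days=self.interval_days)


# Windows and intervals follow the description above;
# 30 days is an assumed stand-in for "monthly".
SHALLOW = BackupTier("shallow", window_years=2, interval_days=7)
DEEP = BackupTier("deep", window_years=40, interval_days=30)
```

A scheduler could call `SHALLOW.is_due(last_run, datetime.now())` to decide whether the weekly update is overdue.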
Now here's the major principle: that there should only be *one* instance of 
each, that is, one current shallow backup and one current deep backup, and each 
of them must be *complete*. So that doesn't mean there couldn't be multiple 
copies of your shallow and multiple of your deep - but unless they're throwaway 
ones, all the copies should be *identical* - that is, they must be synced.
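
One way to check the "all copies identical" rule is to compare a checksum manifest of each copy. This is a minimal sketch, assuming each copy is an ordinary directory tree on disk; the function names `manifest` and `copies_in_sync` are made up for illustration.

```python
import hashlib
from pathlib import Path


def manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Reads the whole file into memory; fine for a sketch,
            # but very large files would be better hashed in chunks.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            result[path.relative_to(root).as_posix()] = digest
    return result


def copies_in_sync(copy_a: Path, copy_b: Path) -> bool:
    """True when both copies hold the same files with the same contents."""
    return manifest(copy_a) == manifest(copy_b)
```

If `copies_in_sync` returns False, comparing the two manifests key by key shows which files are missing or differ.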
I think the completeness is particularly important - with some of my ad hoc 
backups I have left off things I believed were already backed up somewhere 
else. Especially given that I've not been very good at tracking my backup CDs, 
that puts me at very great risk of things falling through the cracks. Even if 
it takes 20 DVDs, or buying external hard drives, there should be only one: you 
always know where it is, you can take steps to protect it. And you don't have 
to worry about any others.
The only reasons I can think of for violating this rule are if you have to 
preserve a kind of very high volume data, or if you have to deal with a kind of 
data that apparently can't be integrated with the rest (e.g. Apple II floppies, 
I believe). But there's no problem with adding extra systems to deal with those 
issues, as long as they're conceptualized to fit under your deep or shallow 
backup. And of course this two-level system doesn't preclude adding more backup 
systems, for instance one for your ripped mp3s. But I feel this is what I would 
need to have peace of mind.
What do you think?
 
 
3 comments:
I'm pretty new to the whole backing up thing. What I do at the moment is run an automated backup using rsync over ssh every couple of days to an account on my webserver.
There are probably more user friendly ways of doing this, but using rsync instead of some sissy GUI program makes me feel 1337.
Does rsync allow you to select which directories to copy over, or do you just do your entire hard drive? Does it check to see which files have changed? Do you have to skip some larger files, like big movie files, and do you have a secondary backup system for those?
Yes to both questions. rsync uses the format
rsync -av [source folder] [destination folder]
(the -av switch means [a]rchive and [v]erbose)
The big thing with rsync, though, is that it not only checks which files have changed, but it only uploads the changed data. So if you have a 100k text file that you change one letter in, it only has to upload the small part of the file around that change, not the entire file.
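
To make the "only the changed data" idea concrete, here is a deliberately simplified sketch. It is not rsync's actual algorithm: it only compares fixed-size blocks at the same position in the old and new file, and the block size and function names are assumptions for illustration.

```python
import hashlib

BLOCK_SIZE = 1024  # hypothetical block size chosen for illustration


def block_digests(data: bytes) -> list[str]:
    """Split data into fixed-size blocks and hash each block."""
    return [
        hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
        for i in range(0, len(data), BLOCK_SIZE)
    ]


def changed_blocks(old: bytes, new: bytes) -> list[int]:
    """Indexes of blocks in `new` that differ from the same-position block in `old`."""
    old_digests = block_digests(old)
    new_digests = block_digests(new)
    return [
        i for i, digest in enumerate(new_digests)
        # A block is "changed" if it has no counterpart or its hash differs.
        if i >= len(old_digests) or digest != old_digests[i]
    ]
```

With a 100,000-byte file split into 98 blocks, changing one letter in place marks a single block as changed, so only that block would need to be sent. Real rsync goes further and uses a rolling checksum to find matching blocks at shifted offsets, so it also copes with inserted or deleted text, which this sketch does not.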
I use this backup system only for my work files (currently ~3.5GB), which are primarily text files (HTML, PHP, etc.) and images with some Photoshop and Illustrator files sprinkled in. My personal movie and TV show collection is stored on DVDs (probably over 1TB).
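
For the questions about choosing directories and skipping big files, a setup like this could be scripted along the following lines. This is a sketch, not the commenter's actual script: the helper name `build_rsync_command`, the paths, and the remote account are hypothetical, while `--exclude` and `--max-size` are standard rsync options.

```python
def build_rsync_command(sources, destination, excludes=(), max_size=None):
    """Build an rsync argument list; nothing is executed here."""
    command = ["rsync", "-av"]  # -a: archive mode, -v: verbose
    for pattern in excludes:
        # Skip files or directories matching this pattern.
        command.append(f"--exclude={pattern}")
    if max_size is not None:
        # Skip any file larger than this size, e.g. "500M".
        command.append(f"--max-size={max_size}")
    command.extend(sources)      # only the listed directories are copied
    command.append(destination)  # local path or user@host:path
    return command


# Hypothetical paths and remote account, for illustration only.
command = build_rsync_command(
    ["work/", "documents/"],
    "user@example.com:backups/",
    excludes=["*.avi"],
    max_size="500M",
)
```

The resulting list can be passed to `subprocess.run(command, check=True)`. A destination of the form `user@host:path` makes rsync use a remote shell, which is ssh by default in current versions.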