A quick personal note: yesterday I was looking for a good secondary backup solution for my personal files. I currently have an external drive attached to my main machine, and every morning cron faithfully rsyncs some critical directories to that external drive. This setup only protects me until one of the following happens:
- the filesystem layer on Mac OS X goes crazy and trashes my drive (unlikely).
- my house gets burglarized (likely).
- I make a mistake, delete files that I should not delete and don’t realize it within 24 hours (likely).
Hence I was looking for yet another copy to have around, in case one of these three events occurs. The three contenders were bingodisk, strongspace and S3.
bingodisk is cheap ($2 per GB per year), has fairly high bandwidth caps and a nice WebDAV interface, but rsync does not play well with it. strongspace has rsync but is too expensive ($15 per month only gets you 5 GB). Both suffer from the fact that they are offered by a small company, joyent. Sure, they use Solaris, ZFS and Thumpers, and granted, the combination is hot, but how much of a guarantee is that that they will protect the data from harm or, for that matter, that they can stay in business long enough? In the same category fall xdrive, box.net and the like.
That left S3, a storage option from a company that will stay around for a while and can obviously operate a large-scale computing environment (I’m not saying that others can’t, simply that they have yet to demonstrate the same scale).
Now I have about 100 GB of data to back up. Based on S3’s prices, that comes to about 200 dollars per year, which is great. Granted, the interface is geared much more toward non-file-based programmatic interaction, so you need a third-party tool to really use it for backups. rsync does not work well with it (I tried it with JungleDisk), but there is a Ruby-based replacement called s3sync, which works quite well and minimizes the amount of data transferred. Off I went with a little script based on that blog entry.
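For what it’s worth, here is a back-of-the-envelope sketch of the yearly cost comparison for my 100 GB. The bingodisk and strongspace rates are the ones quoted above. Two things are assumptions on my part: that strongspace scales as multiples of its 5 GB plan, and that S3 comes out at roughly $2 per GB per year, which is simply my 200 dollar estimate divided back out (request and transfer fees are ignored). The function names are made up for illustration.

```python
import math

BACKUP_GB = 100  # approximate size of my backup set


def bingodisk_per_year(gb):
    """$2 per GB per year, as quoted above."""
    return 2.0 * gb


def strongspace_per_year(gb):
    """$15 per month buys 5 GB.

    ASSUMPTION: more space is bought as multiples of that 5 GB plan;
    the real pricing tiers may well differ.
    """
    plans = math.ceil(gb / 5)
    return plans * 15.0 * 12


def s3_per_year(gb):
    """ASSUMPTION: about $2 per GB per year, derived from my own
    ~$200 estimate for 100 GB; ignores request and transfer fees."""
    return 2.0 * gb


if __name__ == "__main__":
    for name, cost in [
        ("bingodisk", bingodisk_per_year),
        ("strongspace", strongspace_per_year),
        ("S3", s3_per_year),
    ]:
        print(f"{name:12s} ${cost(BACKUP_GB):,.0f} per year")
```

Under those assumptions strongspace is more than an order of magnitude more expensive at this size, and the choice between the other two comes down to tooling and trust rather than price.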
Then I realized it would take a couple of weeks just to upload all my files… So I ended up buying yet another external hard drive that I’ll refresh once a week and store in another location.
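The "couple of weeks" figure is easy to sanity-check. The sketch below estimates how long an initial upload takes at a sustained uplink rate; the rates in the loop are hypothetical examples rather than measurements of my connection, and the estimate ignores protocol overhead and any throttling.

```python
SECONDS_PER_DAY = 86400


def upload_days(size_gb, uplink_kbit_s):
    """Days needed to upload size_gb (decimal gigabytes) at a sustained
    uplink of uplink_kbit_s kilobits per second, ignoring overhead."""
    bits = size_gb * 1e9 * 8
    seconds = bits / (uplink_kbit_s * 1e3)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical residential uplink speeds, for illustration only.
    for kbit_s in (256, 512, 768, 1024):
        days = upload_days(100, kbit_s)
        print(f"{kbit_s:5d} kbit/s -> {days:5.1f} days")
```

At a few hundred kilobits per second, 100 GB needs somewhere between one and five weeks of uninterrupted uploading, which is why the first full copy is the painful part; later incremental syncs would move far less data.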
The prospect of storage as utility is exciting nonetheless.
A Python-based tool with similar functionality to s3sync is s3cmd, via its sync command.