Amazon S3 Cloud backup for the somewhat tech savvy
Recently, a tragedy hit one of the members on my team where he lost his home. Our team has rallied around him and his family and has done what we can to help – that’s just the kind of people I’m fortunate enough to work with. In talking to him, one of his regrets is that he didn’t have his photos backed up offsite. He said he looked into it, but then just didn’t get around to it. That was inspiration to get me moving…
I investigated a number of commercial solutions first, and the best I found was Carbonite. One yearly fee of $59 backs up all your documents, music and photos (no movies). That is hard to beat for those with a significant number of photos or a lot of music. (With the ever-improving CCD imaging of digital cameras, every time you buy a new camera, the photo files are larger. Is it a plot between the hard drive makers and the camera manufacturers? LOL) Sounds like a great deal, right?
The Carbonite app installed smoothly and ran well. It seems one key to their business model is to control bandwidth. Or, perhaps the service is very popular. They warn you that the initial backup could take several days. Well, after more than a week, mine was still less than 50% complete. About that time my trial period and patience both expired. If you don’t mind leaving your computer on for a month, this still looks like a very good option. They also have a switch in their UI where you can use less bandwidth on the upload. This will make the backup take even longer, but will allow the kids to still watch YouTube while you are taking your backups. Lastly, they have a web UI where you can explore your backed up files from anywhere. It’s a viable solution IMHO.
I started looking at other Cloud Backup solutions for my Mac (not the kind of cloud backup they get in Kentucky where it just seems to rain all the time). Amazon S3 seemed like the natural next choice to investigate, but what is it going to cost? Looking at S3 pricing, currently it runs:
Storage tier | Standard Storage | Reduced Redundancy Storage |
---|---|---|
First 1 TB / month | $0.140 per GB | $0.093 per GB |
Next 49 TB / month | $0.125 per GB | $0.083 per GB |
Next 450 TB / month | $0.110 per GB | $0.073 per GB |
Next 500 TB / month | $0.095 per GB | $0.063 per GB |
Next 4000 TB / month | $0.080 per GB | $0.053 per GB |
Over 5000 TB / month | $0.055 per GB | $0.037 per GB |
So, this is more than Carbonite at my data volume, but more reasonable at the “reduced redundancy” pricing. Reduced redundancy is perfect for my use case since I back up all my files to an external hard drive already and this really is a disaster recovery scenario. So for me, this will run around $84 a year. Still expensive, but historically S3 prices have also gone down at least twice a year. We’ll see how it works out. At the very least, it’s cool.
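If you want to plug in your own numbers, here is a rough sketch of the arithmetic in shell. The 75 GB figure is only an illustration (it is roughly what $84 a year works out to at the first-tier reduced redundancy rate), and yearly_cost is a helper name made up for this sketch. It covers storage only and ignores request and download charges.

```shell
#!/bin/sh
# Rough yearly S3 storage cost: GB stored x price per GB per month x 12.
# yearly_cost is a hypothetical helper, not part of any Amazon tool.
yearly_cost() {
    gb="$1"      # amount stored in GB (assumed constant all year)
    rate="$2"    # price per GB per month, taken from the table above
    awk -v gb="$gb" -v rate="$rate" 'BEGIN { printf "%.2f\n", gb * rate * 12 }'
}

yearly_cost 75 0.093   # reduced redundancy, first tier: prints 83.70
yearly_cost 75 0.140   # standard storage, first tier: prints 126.00
```

Swap in your own library size and the rate for whichever tier and storage class you end up using.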
Another option worth considering is Amazon’s new “Cloud Drive”. The prices are lower than S3, with 5 GB free and other tiers at $1/GB per year. The tools are a little clunky right now as it is really aimed at working with music. If you are mostly worried about backing up music, Cloud Drive makes it completely simple with their music upload and streaming tools. For other file types it’s a little more manual. But the price is right.
Back to exploring S3. First, we need to check out the tools available for managing S3. At this point, I was feeling very cheap since the storage costs are a little more than I wanted in the first place. There are some good tools out there like Jungle Disk that would likely make this much easier, but I was looking for cheap as opposed to easy. With Jungle Disk, you could take the complexity of the rest of this solution down considerably.
First step is to go to Amazon and create an Amazon Web Services account. You probably already have an amazon.com account and you can use the same login. Then log in to Amazon Web Services and create an S3 bucket.
For syncing files to S3, I found an attractive free option in s3sync, a Ruby gem that gives us a command line way to sync between my Mac and S3. Here’s a great blog entry on the Ruby gem installation and config, so I won’t repeat that part. Then, to back up your files, use a command similar to this:
s3sync -r -v /Users/YOURUSERNAME/Pictures/iPhoto\ Library/Originals/2011 yourbucket:iPhotoBackup/Originals
The above will copy the photos taken this year (2011) out of iPhoto on your Mac and into the Originals folder in your bucket. You’ll need to create the folder structure iPhotoBackup/Originals before executing this command. You could also leave off the “/2011” from the source and the “/Originals” from the destination, like this, to back up your entire iPhoto library, but this is going to take a very long time to upload to S3:
s3sync -r -v /Users/YOURUSERNAME/Pictures/iPhoto\ Library/Originals yourbucket:iPhotoBackup
With the -v option you see each file listed as it is uploaded. Like Carbonite, this will also take quite a while, and during the upload so much of your internet bandwidth will be consumed that Netflix on demand, web browsing, etc. will be slow for everyone in the house. Not surprising, just thought I’d throw that out there. This is a good reason to do it directory by directory, perhaps overnight, until you have it all complete.
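One way to do the directory-by-directory approach is a small loop around the same s3sync command. This is only a sketch: the year folder names, the sync_years helper and the DRY_RUN switch are all made up for illustration, and the paths and bucket are the same placeholders used above. By default it just prints the commands it would run, so nothing is uploaded until you ask for it.

```shell
#!/bin/sh
# Upload one year folder at a time so each run fits in an overnight window.
# SRC and DEST mirror the example commands above; change them for your setup.
SRC="/Users/YOURUSERNAME/Pictures/iPhoto Library/Originals"
DEST="yourbucket:iPhotoBackup/Originals"

# sync_years is a hypothetical helper. With DRY_RUN=1 (the default here)
# it only prints each s3sync command instead of running it.
sync_years() {
    for year in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "s3sync -r -v \"$SRC/$year\" $DEST"
        else
            # Stop at the first failed upload so it can be retried later.
            s3sync -r -v "$SRC/$year" "$DEST" || return 1
        fi
    done
}

# Example year folders; list whichever ones exist in your library.
sync_years 2009 2010 2011
```

Once the printed commands look right, run the script with DRY_RUN=0 and hand it one or two years per night.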
The next step is very important to save you $$$s. You need to go to your Amazon Web Services Console, explore your S3 bucket, right click on the folder you just uploaded, and select Properties (or click the Properties button at top right). From there, you need to select “Reduced Redundancy” and Save. This will then iterate through all the items in the folder and mark them for reduced redundancy. There is no way to select this as the default for all files uploaded to a bucket. Hmmmmm, I wonder why? A bit greedy, Amazon?
If you are a Windows user, you may want to check out CloudBerry Explorer. They have a nice S3 interface that supposedly can mark each file for reduced redundancy after uploading for you. Looks like an interesting option.
There is quite a bit more to know about S3 than is contained in this blog post. For example, you can make selected files or folders public and hand out URLs, etc. Also, Amazon doesn’t charge you for transfer bandwidth on the upload, but does on the download. There are many other considerations to think through in choosing a cloud backup solution that is right for you, but hopefully you find this informative and useful.
3 Responses to “Amazon S3 Cloud backup for the somewhat tech savvy”
Amazon Cloud Drive is just a front end for Amazon S3. I’ve been using S3 for a long time. It’s true that S3 doesn’t understand folders, but it’s easy to simulate them and preserve the organization of your data. I use s3sync to do it. No dealbreaker.
CloudBerry Lab is excited about Google announcements. Since we want to offer the most cost efficient product for our customers while allowing them to own their storage account, we are considering adding an option to back up data to Google online storage in addition to Amazon S3. Stay tuned!