Jim Bassett's Weblog comments

StorageMojo's look at the paper by Bianca Schroeder of CMU's Parallel Data Lab, Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you?

Storage is very conservative, so don't expect overnight change, but these papers will accelerate the consumerization of large-scale storage. High-end drives still have advantages, but those fictive MTBFs aren't one of them anymore.

Further, these results validate the Google File System's central redundancy concept: forget RAID, just replicate the data three times. If I'm an IT architect, the idea that I can spend less money and get higher reliability from simple cluster storage file replication should be very attractive.

By a much different methodology, I've come to the same conclusion for a big set of my data: forget RAID and replicate the data.
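
To put the paper's title question in numbers, here is a minimal Python sketch that converts a datasheet MTTF into a yearly failure probability, and then into the chance of losing every copy of triply replicated data. It is a back-of-the-envelope illustration, not the paper's or Google's method. The function names are made up for this example, and the model assumes a constant failure rate, independent failures, and no re-replication during the year. The quote above calls datasheet MTBFs fictive, so treat the result as the datasheet's claim rather than field behavior.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8760 hours

def annual_failure_rate(mttf_hours: float) -> float:
    """Probability that one drive fails within a year.

    Assumes a constant failure rate (exponential lifetime model).
    """
    return 1.0 - math.exp(-HOURS_PER_YEAR / mttf_hours)

def prob_all_replicas_fail(afr: float, replicas: int = 3) -> float:
    """Probability that every replica fails in the same year.

    Assumes independent failures and no repair or re-replication.
    """
    return afr ** replicas

afr = annual_failure_rate(1_000_000)
print(f"Datasheet yearly failure probability: {afr:.4%}")
print(f"All three replicas lost in one year: {prob_all_replicas_fail(afr):.3e}")
```

Under these assumptions a 1,000,000-hour MTTF works out to a bit under 0.9% per drive per year. Plugging in a higher observed rate instead of the datasheet figure shows how quickly the picture changes for a single drive, and how much margin three independent copies still leave.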


Interesting.

If you need fast access or a "volume" bigger than what's readily available on one disk, RAID 0 may still have a place. If you need fast and fault-tolerant, RAID 0+1 (with an offsite backup) seems like the way to go. RAID 5 is the big loser according to this analysis.

For data integrity, I'm a big believer in geographical diversity. I try to keep a backup in a different county.

Also, regarding drive types, with electronics it's very often (but not always) true that mass production = reliability. Whatever they make the most of, they know how to make well.
- mark 2-21-2007 8:58 am
