Monday, September 26, 2011

"The Google File System"

This paper describes a scalable distributed file system designed for large, data-intensive distributed applications. It shares the goals of previous DFSs (performance, scalability, reliability, and availability) but also reflects newer trends: tolerance of component failures (a consequence of running on commodity hardware), huge files, files that are modified primarily by appends, and the flexibility gained by co-designing the file system API and the applications together. GFS provides the usual file operations, though it does not implement a standard API such as POSIX; it also adds snapshot and record append operations. A GFS cluster has a single master and multiple chunkservers, accessed by multiple clients. Files are divided into 64MB chunks, and all metadata is stored in the master. Clients contact the master, are directed to the appropriate chunkserver, and then interact directly with that server. GFS has a relaxed consistency model, so applications rely on appends rather than overwrites, use checkpointing, and write self-validating, self-identifying records. The system favors high throughput over low latency.
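
To make the master/chunkserver split concrete, here is a minimal sketch of the client read path described above. The `master.lookup()` and `chunkserver.read()` helpers are hypothetical stand-ins for the paper's RPCs, and the real client library also caches chunk locations and handles retries.

```python
# Sketch of a GFS-style read: the master serves only metadata, the data
# itself flows directly between the client and a chunkserver.
CHUNK_SIZE = 64 * 1024 * 1024  # fixed 64MB chunk size

def read(master, filename, offset, length):
    # Translate the byte offset into a chunk index within the file.
    chunk_index = offset // CHUNK_SIZE
    chunk_offset = offset % CHUNK_SIZE

    # Ask the master for the chunk handle and replica locations (metadata only).
    handle, replica_locations = master.lookup(filename, chunk_index)

    # Read the bytes directly from one of the chunkservers, bypassing the master.
    chunkserver = replica_locations[0]
    return chunkserver.read(handle, chunk_offset, length)
```
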
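The "self-validating, self-identifying records" point is easy to gloss over, so here is a rough illustration (my own, not from the paper) of what such a record format might look like: each record carries its own id and checksum, letting a reader detect and skip the padding, duplicates, or corrupted regions that the relaxed consistency model can leave behind.

```python
import struct
import zlib

def encode_record(record_id: int, payload: bytes) -> bytes:
    # Header: 8-byte record id + 4-byte payload length, followed by the
    # payload and a CRC32 over everything before it.
    header = struct.pack(">QI", record_id, len(payload))
    checksum = zlib.crc32(header + payload)
    return header + payload + struct.pack(">I", checksum)

def try_decode_record(buf: bytes):
    # Returns (record_id, payload, bytes_consumed), or None if the bytes are
    # incomplete or corrupt so the reader can resynchronize further on.
    if len(buf) < 12:
        return None
    record_id, length = struct.unpack(">QI", buf[:12])
    end = 12 + length
    if len(buf) < end + 4:
        return None
    payload = buf[12:end]
    (stored,) = struct.unpack(">I", buf[end:end + 4])
    if zlib.crc32(buf[:end]) != stored:
        return None
    return record_id, payload, end + 4
```
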



This file system was clearly designed around Google's fairly specific goals and workload characteristics. Many of its particular constraints (e.g., files modified mostly by appends) don't fit the needs of other applications. However, the solution is still highly relevant, since it scales simply and suits MapReduce-style workloads well (as evidenced by the popularity of the Hadoop/HDFS combination). GFS was novel in its tolerance of failures on commodity hardware, its focus on large files, and its relaxed consistency guarantees (a trade-off among simplicity, consistency, and performance). As Google's experience demonstrates, the design works, and works well, for their needs. I think it will remain relevant over the next decade, although it will no doubt be modified to suit other tasks. The single-master approach in particular seems to have limitations (as we heard in the Cloudera lecture), so we may see changes there.
