Monday, September 26, 2011

“Megastore: Providing Scalable, Highly Available Storage for Interactive Services”

This paper describes a system developed at Google to provide a scalable storage system that achieves high reliability through synchronous replication and provides ACID semantics over partitions of data. The need for this arose because NoSQL datastores, while highly scalable, have limited APIs and loose consistency models that are not appropriate for all types of applications, while consistent replication across datacenters is difficult.

Megastore accomplishes this by partitioning the datastore and replicating each partition separately, providing ACID compliance within partitions, but only limited consistency guarantees across them, on the assumption that most Internet service data can be partitioned by user, etc. Write-ahead logging is used for transactions and concurrency control. Replication uses a low-latency implementation of Paxos that is optimized for reads, such that local reads can be performed on any up-to-date replica. 

Synchronous replication imposes some tradeoffs with the Paxos implementation: low numbers of writes per entity are necessary to avoid high conflict rates. In general this isn’t a problem for Google, since write throughput can be scaled through higher-grain sharding, but this still seems like a significant inconvenience.

I thought the combination of NoSQL and traditional DBMS guarantees was innovative and seems to provide good performance for many Internet serice-type applications, where data can easily be partitioned into entities (which need to be consistent). The emphasis on making data local to speed up reads seems particularly important, since most of these types of applications have high reads. I’m not sure that this paper or Megastore itself will remain influential over the next decade, but I do think we’ll see more of this type of merging NoSQL and traditional DBMS features as Internet app data continues to skyrocket.

No comments:

Post a Comment