This paper surveys a
number of issues associated with scaling databases and key-value stores on
cloud computing infrastructure and gives an overview of current research
addressing these problems. Scalability is the major problem, since RDBMSs have
not traditionally been designed to scale out horizontally very well past a few
machines. On the other hand, key-value stores scale very well, but providing
multiple-key transactions on them is difficult. The authors call this problem of
multi-key atomicity “data fusion.” They designed G-Store to
provide transactional multi-key access guarantees over dynamic, non-overlapping
groups of keys using a leader/follower key ownership abstraction. In contrast,
“data fission” is the problem of sharding a database into relatively
independent partitions. Migration of data between partitions is another
challenge, both for shared storage and shared-nothing DB architectures. For
shared storage, the authors have designed an “Iterative Copy” technique that
transfers the main memory state of the partition to avoid warm-up time at the
destination. For the shared-nothing architecture, the persistent image of the
database must also be migrated, which is usually much larger than just the
memory state copied in “Iterative Copy.” To accomplish this task, Zephyr
introduces a synchronized phase that allows both the source and destination to execute
transactions for the tenant while the data is migrated, using a combination of
on-demand pull and asynchronous push of data, minimizing the window of
unavailability.
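
To make the pull/push combination concrete, here is a toy sketch of how I picture it. This is my own illustration, not code from the paper: the class names, the page-level granularity, and the rule that a page lives on exactly one node at a time are all assumptions, and the "asynchronous" push is simplified to a plain loop that a real system would run in the background.

```python
# Toy model of migrating a tenant's pages (hypothetical names throughout).

class Source:
    """Node that initially holds all of the tenant's pages."""

    def __init__(self, pages):
        self.pages = dict(pages)  # page_id -> page contents

    def hand_over(self, page_id):
        # Assumption: handing a page over removes it from the source,
        # so each page is owned by exactly one node at any time.
        return self.pages.pop(page_id)

    def remaining(self):
        return list(self.pages)


class Destination:
    """Node that starts empty and serves requests during migration."""

    def __init__(self, source):
        self.source = source
        self.pages = {}
        self.pulled = []  # ids of pages fetched on demand

    def read(self, page_id):
        # On-demand pull: fetch a page from the source only when a
        # request touches a page that has not arrived yet.
        if page_id not in self.pages:
            self.pages[page_id] = self.source.hand_over(page_id)
            self.pulled.append(page_id)
        return self.pages[page_id]

    def receive_push(self, page_id, data):
        self.pages[page_id] = data


def push_remaining(source, destination):
    """Push every page the destination has not pulled yet.

    Simplification: this runs synchronously here; in a real system it
    would run in the background while requests keep being served.
    """
    pushed = []
    for page_id in source.remaining():
        destination.receive_push(page_id, source.hand_over(page_id))
        pushed.append(page_id)
    return pushed
```

In this sketch, a request that arrives at the destination early pays for one pull, while pages nobody asks for still arrive through the push, so the destination never has to wait for a full copy before serving requests.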
The
problems addressed by this paper are real. Scaling databases is difficult,
and there don’t seem to be any great solutions to repartitioning and live
migration in sight, though the techniques mentioned can reduce their impact.
Transactions on multiple keys in a key-value store are also important. That
said, this paper reads like a laundry list of problems and solutions, and I found
the overall structure somewhat incoherent. The paper doesn’t seem to
introduce anything new and is just an overview of current research, so I
doubt that it will be influential.