Wednesday, November 2, 2011

Hadoop NextGen


Hadoop NextGen, the next major revision to the Hadoop MapReduce framework, addresses several issues with Hadoop’s scalability and design. Above all, the current version of Hadoop doesn’t scale past 4,000 nodes and the current architectural design makes it difficult to fix this and other problems without a fundamental overhaul of the code base.

Hadoop NextGen divides the responsibilities currently handled by the job tracker into two separate components. The Resource Manager globally assigns compute resources to applications, while the per-application Application Master manages scheduling and coordination (an application is either a single MapReduce job, or a DAG of MR jobs).

Reading this blog post after Mesos, it seems like Nextgen takes an approach that is quite similar to Mesos, though at a different level of abstraction. The ResourceManager acts much like Mesos’ scheduler, coordinating the global assignment of compute resources to applications (though without Mesos’ resource offer model). At the same time, Nextgen is at a lower level of abstraction – in principle, it could itself run on Mesos, sharing resources with, say, Spark, MPI, and other frameworks. That said, while Mesos could enable multiple Hadoop instances to run over the same cluster, this would not fix the scalability issues inherent in the current version, since Hadoop would still not be able to scale over 4000 nodes for a given job. Thus, Nextgen still fills a need that Mesos alone (or another resource scheduler) could not necessarily satisfy.

As far as tradeoffs go, certain parts of the Nextgen framework are still centralized (e.g. the resource manager), so this could become a bottleneck for performance in the future. Despite this, Nextgen appears very promising and makes many significant performance and scalability improvements on the ever-popular Hadoop framework, which should guarantee continuing popularity for Hadoop (that is, if everyone doesn’t switch to Spark first!).

No comments:

Post a Comment