hadoop-mapreduce-dev mailing list archives

From Steve Loughran <steve.lough...@gmail.com>
Subject Re: HA MRv1 JobTracker?
Date Sun, 17 Jun 2012 18:41:14 GMT
On 16 June 2012 23:23, Andrew Purtell <apurtell@apache.org> wrote:

> We are planning to run a next generation of Hadoop ecosystem components in
> our production in a few months. We plan to use HDFS 2.0 for the HA NameNode
> work. The platform will also include YARN but its use will be experimental.
> So we'll be running something equivalent to the CDH MR1 package to support
> production workloads for I'd guess a year.
>
> We have heard a rumor regarding the existence of a version of the MR1
> Jobtracker that persists state to Zookeeper such that failover to a new
> instance is fast and doesn't lose job state. I'd like to be aspirational
> and aim for a HA MR1 Jobtracker to complement the HA namenode. Even if no
> such existing code is available, we might adapt existing classes in the MR1
> Jobtracker to models/proxies of state in zookeeper. For clusters of our
> size (in the 100s of nodes range) this could be workable. Also, the MR
> client could possibly use ZK for failover like the HDFS client.
>
> I'm trying to find out first the availability of such code if anyone knows.
> Otherwise, we may try building this, and so also I'd like to get a sense of
> any interest in usage or dev collaboration.
>
> Best regards,
>
>    - Andy
>


There's been work on JT recovery:
https://issues.apache.org/jira/browse/MAPREDUCE-3837

& Arun's added the ability to take the JT offline while keeping the process
up, so it can ride out changes in the layers underneath (e.g. DFS outage,
networking ops).
https://issues.apache.org/jira/browse/MAPREDUCE-4328

These don't deliver HA, but they do bring resilience to failures in other
parts of the infrastructure, and should be used as part of the HA story.

Moving state to ZK could work; it'd be interesting to test the performance.
It might be easiest not to save all state about ongoing work, but only the
state of completed tasks. On a restart the ongoing jobs would remain live,
but any tasks in flight at the time would need resetting and rerunning. Not
ideal, but lower cost.
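
To make the completed-tasks-only idea concrete, here is a rough sketch. It is
not JobTracker code: `InMemoryStore` is a hypothetical stand-in for ZooKeeper
(a real version would go through a ZK client), and the `Tracker` class, the
`/jobs` and `/done` paths and the task states are all invented for
illustration. The point is only which writes get persisted and what a
restarted instance can rebuild from them.

```python
from dataclasses import dataclass, field


class InMemoryStore:
    """Hypothetical stand-in for ZooKeeper: maps a path to bytes."""

    def __init__(self):
        self._nodes = {}

    def put(self, path, data):
        self._nodes[path] = data

    def get(self, path):
        return self._nodes[path]

    def children(self, prefix):
        # Names of the nodes directly addressed under `prefix`.
        prefix = prefix.rstrip("/") + "/"
        return sorted(p[len(prefix):] for p in self._nodes if p.startswith(prefix))


@dataclass
class Tracker:
    store: InMemoryStore
    # job id -> {task id -> "PENDING" | "RUNNING" | "DONE"}, in memory only
    jobs: dict = field(default_factory=dict)

    def submit(self, job_id, task_ids):
        self.jobs[job_id] = {t: "PENDING" for t in task_ids}
        # Persist the task list so a new instance knows the job exists.
        self.store.put(f"/jobs/{job_id}", ",".join(task_ids).encode())

    def start(self, job_id, task_id):
        # Deliberately not persisted: in-flight state is the expensive part.
        self.jobs[job_id][task_id] = "RUNNING"

    def complete(self, job_id, task_id):
        self.jobs[job_id][task_id] = "DONE"
        self.store.put(f"/done/{job_id}/{task_id}", b"")

    @classmethod
    def recover(cls, store):
        """Rebuild state after a restart: jobs stay live, completed tasks
        are kept, everything else goes back to PENDING."""
        tracker = cls(store)
        for job_id in store.children("/jobs"):
            raw = store.get(f"/jobs/{job_id}").decode()
            task_ids = [t for t in raw.split(",") if t]
            done = set(store.children(f"/done/{job_id}"))
            tracker.jobs[job_id] = {
                t: "DONE" if t in done else "PENDING" for t in task_ids
            }
        return tracker
```

With this layout there is one write per job submission and one per task
completion, so the load on the store scales with completed tasks rather than
with every heartbeat or status change. The price is that a task that was
running at failover gets scheduled again from scratch.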
