hadoop-common-dev mailing list archives

From "Francesco Salbaroli (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-4586) Fault tolerant Hadoop Job Tracker
Date Thu, 18 Dec 2008 10:48:44 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Francesco Salbaroli updated HADOOP-4586:
----------------------------------------

    Attachment: jgroups-all.jar
                HADOOP-4586-0.1.patch

This is a very preliminary release of Fault tolerant Hadoop, tested only locally.

The hadoop source tree is only slightly modified in the org.apache.hadoop.mapred.TaskTracker
class.

The package containing the fault tolerance wrapper is org.apache.hadoop.mapred.faulttolerant.

The JGroups library (jgroups-all.jar) must be copied into the lib/ folder.

To run the FT version of Hadoop:
1) Configure hadoop-site.xml to match the environment
2) Format the HDFS filesystem ($HADOOP_HOME/bin/hadoop namenode -format)
3) Run the HDFS daemons (to run locally $HADOOP_HOME/bin/start-dfs.sh)
4) Run one or more instances of FTJobTracker ($HADOOP_HOME/bin/hadoop org.apache.hadoop.mapred.faulttolerant.FTJobTracker)
5) Run one or more instances of FTTaskTracker ($HADOOP_HOME/bin/hadoop org.apache.hadoop.mapred.faulttolerant.FTTaskTracker)
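
Steps 2 to 5 above could be collected into a small script along these lines. This is only a sketch for a single-machine setup, not part of the patch. The default HADOOP_HOME of /opt/hadoop, the RUN switch and the helper names (run, run_bg, start_ft_hadoop) are assumptions made for illustration. The format step is written as "bin/hadoop namenode -format", the usual form of that command. Step 1 (editing hadoop-site.xml) still has to be done by hand.

```shell
#!/bin/sh
# Sketch of steps 2-5 for a local, single-machine test.
# Assumptions (not from the patch): HADOOP_HOME defaults to /opt/hadoop,
# and RUN=1 switches from printing the commands to executing them.
HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop}"
FT_PKG="org.apache.hadoop.mapred.faulttolerant"

# Run a command in the foreground, or just print it in dry-run mode.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Run a long-lived daemon in the background, or print it in dry-run mode.
run_bg() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@" &
    else
        echo "$* &"
    fi
}

start_ft_hadoop() {
    run "$HADOOP_HOME/bin/hadoop" namenode -format            # step 2
    run "$HADOOP_HOME/bin/start-dfs.sh"                       # step 3
    run_bg "$HADOOP_HOME/bin/hadoop" "$FT_PKG.FTJobTracker"   # step 4
    run_bg "$HADOOP_HOME/bin/hadoop" "$FT_PKG.FTTaskTracker"  # step 5
}

start_ft_hadoop
```

By default the script only prints the four commands in order, so the sequence can be checked before anything is started. To run more than one FTJobTracker or FTTaskTracker, repeat the corresponding run_bg line.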

Regards,
	Francesco Salbaroli



> Fault tolerant Hadoop Job Tracker
> ---------------------------------
>
>                 Key: HADOOP-4586
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4586
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.18.0
>         Environment: High availability enterprise system
>            Reporter: Francesco Salbaroli
>            Assignee: Francesco Salbaroli
>         Attachments: FaultTolerantHadoop.pdf, HADOOP-4586-0.1.patch, jgroups-all.jar
>
>   Original Estimate: 2016h
>  Remaining Estimate: 2016h
>
> The Hadoop framework has been designed, in an effort to enhance
> performance, with a single JobTracker (master node). Its responsibilities
> range from managing the job submission process and computing the input
> splits to scheduling tasks on the slave nodes (TaskTrackers) and
> monitoring their health.
> In some environments, such as the IBM and Google Internet-scale
> computing initiative, there is a need for high availability, and performance
> becomes a secondary issue. In these environments, having a system with
> a Single Point of Failure (such as Hadoop's single JobTracker) is a major
> concern.
> My proposal is to provide a redundant version of Hadoop by adding
> support for multiple replicated JobTrackers. This design can be approached
> in many different ways.
> In the document at: http://sites.google.com/site/hadoopthesis/Home/FaultTolerantHadoop.pdf?attredirects=0
> I wrote an overview of the problem and some approaches to solve it.
> I am posting this to the community to gather feedback on the best way to proceed with my work.
> Thank you!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

