hadoop-user mailing list archives

From Dan F <dfranko...@gmail.com>
Subject Please define blacklisting, graylisting, and excluded nodes in Hadoop 1.0.3
Date Fri, 22 Feb 2013 06:28:08 GMT
The top of the JobTracker page in Hadoop reports blacklisted,
graylisted, and excluded nodes. (We are using Amazon EMR AMI 2.3.1, which
I believe is Hadoop 1.0.3.)

The Hadoop docs<http://hadoop.apache.org/docs/r1.0.4/cluster_setup.html#Monitoring+Health+of+TaskTracker+Nodes>
say nodes can be blacklisted by a monitoring script. They do not say whether
there is a default monitoring script, or what it might do.

MapR<http://www.mapr.com/doc/display/MapR/mapred-site.xml#mapred-site.xml-mapred.max.tracker.blacklists>
says a task tracker is blacklisted across all jobs once
mapred.max.tracker.blacklists jobs have blacklisted it. (It says a task
tracker is blacklisted from a job after mapred.max.tracker.failures of that
job's tasks have failed on it.)
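
To make my reading of the MapR description concrete, here is a toy Python
model of the two-level rule. It is only a sketch of how I understand the
docs, not Hadoop's JobTracker code: the class and method names are made up,
and both thresholds are assumed to be 4, matching the "4 times in a job,
then across 4 jobs" behavior mentioned further down.

```python
from collections import defaultdict

# Assumed values (hypothetical constants standing in for the two properties).
MAX_TRACKER_FAILURES = 4    # mapred.max.tracker.failures
MAX_TRACKER_BLACKLISTS = 4  # mapred.max.tracker.blacklists


class BlacklistModel:
    """Toy model of per-job, then cross-job, blacklisting. Not Hadoop code."""

    def __init__(self):
        # (job, tracker) -> number of that job's tasks that failed on the tracker
        self.failures = defaultdict(int)
        # tracker -> set of jobs that have blacklisted it
        self.job_blacklists = defaultdict(set)

    def record_task_failure(self, job, tracker):
        """Count one failed task; blacklist the tracker for this job at the threshold."""
        self.failures[(job, tracker)] += 1
        if self.failures[(job, tracker)] >= MAX_TRACKER_FAILURES:
            self.job_blacklists[tracker].add(job)

    def blacklisted_for_job(self, job, tracker):
        """True if this job no longer schedules tasks on the tracker."""
        return job in self.job_blacklists[tracker]

    def blacklisted_across_jobs(self, tracker):
        """True once enough distinct jobs have blacklisted the tracker."""
        return len(self.job_blacklists[tracker]) >= MAX_TRACKER_BLACKLISTS
```

In this model, four failed tasks from one job remove the tracker from that
job only; it takes four different jobs each doing so before the tracker is
blacklisted for everyone. Whether 1.0.3 actually behaves this way is exactly
what I am asking.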

So which is it: the monitoring script; the per-job blacklist that then
escalates across jobs; both; or some other mechanism? Is there a definitive
source of this information?

If I look in Jira (MAPREDUCE-1966) and the source code (JobTracker.java),
it looks as if the blacklisting MapR described (4 times in a job, then
across 4 jobs) was changed to graylisting because there was debate over
the heuristics. However, it is unclear to me whether that change affects
1.0.3: "Fixed version" in Jira shows "unresolved."

And what about excluded?

Please rigorously define blacklisted, graylisted, and excluded nodes for
1.0.3, preferably with a reference. Thanks!
