hadoop-mapreduce-user mailing list archives

From Christoph Schmitz <Christoph.Schm...@1und1.de>
Subject Under-replication warnings for Distributed Cache?
Date Mon, 15 Aug 2011 13:40:15 GMT

We're running an 8-node Hadoop cluster with CDH2. Recently, our monitoring tools caught warnings
like this one when running fsck on HDFS:

/tmp/hadoop-tgp/mapred/system/job_201105191458_1857/job.jar:  Under replicated blk_-6996370258385460742_366223. Target Replicas is 10 but found 8 replica(s).
// Lots more like it on every file in the Distributed Cache.

Obviously, this means that the default replication factor of mapred.submit.replication=10
cannot be reached since we only have 8 datanodes. I found the place in the code (JobClient.java)
where this property is consumed and used for replicating the job jar and the Distributed Cache,
so I understand (kind of ;-) where the warning comes from.

Still, I have two questions: Shouldn't mapred.submit.replication be capped automatically
at the number of datanodes? And more generally, should I worry about this warning?
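
To make the first question concrete, here is a minimal sketch of the kind of cap I have in mind. It is plain Java with no Hadoop dependency; the class SubmitReplication and the method effectiveReplication are hypothetical names of my own and not anything in JobClient.java.

```java
public class SubmitReplication {

    // Hypothetical helper, not part of Hadoop: caps the requested replication
    // at the number of live datanodes and never returns less than 1.
    static int effectiveReplication(int requested, int liveDatanodes) {
        return Math.max(1, Math.min(requested, liveDatanodes));
    }

    public static void main(String[] args) {
        int requested = 10;     // default value of mapred.submit.replication
        int liveDatanodes = 8;  // number of datanodes in our cluster
        // With these values the cap yields 8, matching what fsck actually finds.
        System.out.println(effectiveReplication(requested, liveDatanodes));
    }
}
```

Absent such a cap, the workaround is presumably to set mapred.submit.replication to a value no larger than the datanode count in the job or cluster configuration.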

Thanks and best regards,

Christoph Schmitz

1&1 Internet AG
Ernst-Frey-Straße 10 · DE-76135 Karlsruhe
Phone: +49 721 91374-6733

Management Board: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Robert Hoffmann, Markus Huhn,
Hans-Henning Kettler, Dr. Oliver Mauss, Jan Oetjen
Chairman of the Supervisory Board: Michael Scheeren
