hadoop-mapreduce-user mailing list archives

From John Armstrong <john.armstr...@ccri.com>
Subject Re: Large startup time in remote MapReduce job
Date Wed, 22 Jun 2011 12:16:21 GMT
On Wed, 22 Jun 2011 00:15:56 +0200, Gabor Makrai <makrai.list@gmail.com>
wrote:
> Fortunately, DistributedCache solved my problem! I put a jar file on
> HDFS that contains the necessary classes for the job, and I used this:
> *DistributedCache.addFileToClassPath(new Path("/myjar/myjar.jar"), conf);*

Can I ask which version of Hadoop you're using?  Whenever I try to use
addFileToClassPath on 0.20.2+737 it adds the file to the distributed cache
but my mappers and reducers still can't find the classes.  I'm stuck with
handing around a huge fat jar as my job.jar that contains all the
dependencies my mappers and reducers need.  I think this is related to
MAPREDUCE-752, but so far nobody on this list has offered a real
diagnosis.
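
One way to narrow this down might be to log, from inside a task, whether a
given dependency class is actually loadable and what the JVM's classpath
looks like. What follows is only a rough sketch using plain JDK calls
(Java 7 or later for the multi-catch); ClasspathProbe and its methods are
hypothetical names, not part of Hadoop, and the idea of calling report()
from a mapper's or reducer's setup code is an assumption about where the
check would be useful. Note that java.class.path only reflects the
classpath the JVM was started with, so the context class loader may see
more than that list shows.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; not part of Hadoop.
class ClasspathProbe {

    // True if the class can be found by the current thread's context
    // class loader (falling back to this class's own loader).
    static boolean isLoadable(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClasspathProbe.class.getClassLoader();
        }
        try {
            // 'false' = locate the class without running its static initializers.
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Entries of the classpath the JVM was started with.
    static List<String> classpathEntries() {
        List<String> entries = new ArrayList<>();
        String cp = System.getProperty("java.class.path", "");
        for (String entry : cp.split(File.pathSeparator)) {
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    // Human-readable summary suitable for writing to a task log.
    static String report(String className) {
        StringBuilder sb = new StringBuilder();
        sb.append(className)
          .append(isLoadable(className) ? " is loadable" : " is NOT loadable")
          .append('\n');
        for (String entry : classpathEntries()) {
            sb.append("  classpath: ").append(entry).append('\n');
        }
        return sb.toString();
    }
}
```

Printing ClasspathProbe.report("some.dependency.ClassName") (a placeholder
class name) in the task logs would at least show whether the jar added
through the distributed cache is visible to the task at all.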
