hadoop-mapreduce-user mailing list archives

From Victor Hsieh <victorhs...@gmail.com>
Subject Re: Add user jars to mapreduce
Date Wed, 20 Jan 2010 11:59:47 GMT
Hi Eric,

(I am new to this mailing list, so I don't have the original email to
reply to directly.)

I ran into exactly the same problem today and finally found the cause.

In our case, we also added some URIs to the DistributedCache, as you
did.  Unfortunately, the URIs themselves were the problem.  When we
added several jars by calling addFileToClassPath, the entries were
joined with colons, the default path separator for a Java classpath on
Unix-like systems.  Since each URI also contains colons, this is what
causes the failure.

For example, if you add hdfs://example.com:9000/a.jar and
hdfs://example.com:9000/b.jar to the classpath, your
mapred.job.classpath.files will look like this (note the colons!):

  hdfs://example.com:9000/a.jar:hdfs://example.com:9000/b.jar

Then, when a worker tries to add them to the classpath (search for
getFileClassPaths in org.apache.hadoop.mapred.TaskRunner.java), it
splits on the colons and actually adds "hdfs", "//example.com",
"9000/a.jar", and so on, which is not what we want.
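
To see the effect outside Hadoop, here is a minimal sketch in plain
Java.  It does not call any Hadoop code; the class and method names
are made up for illustration, and the ":" separator is hard-coded on
the assumption of a Unix-like worker.  It only shows what splitting
the joined string on the path separator produces.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of Hadoop.
class ClasspathSplitDemo {
    // Split a joined classpath string on the given separator, roughly
    // what happens when the entries are turned back into a list.
    static List<String> split(String joined, String separator) {
        return Arrays.asList(joined.split(Pattern.quote(separator)));
    }

    public static void main(String[] args) {
        // Two full HDFS URIs joined with ":" (assumed Unix path separator).
        String joined = "hdfs://example.com:9000/a.jar"
                + ":" + "hdfs://example.com:9000/b.jar";
        // Each URI is broken into three pieces instead of staying whole.
        for (String entry : split(joined, ":")) {
            System.out.println(entry);
        }
    }
}
```

Each URI contributes three fragments, so two jars end up as six
useless classpath entries.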

Our solution is to strip the "hdfs://example.com:9000" prefix from
each path before calling addFileToClassPath.  Hope it helps!
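
A rough sketch of that workaround, again in plain Java without any
Hadoop calls.  The helper name pathOnly is hypothetical; it uses
java.net.URI to drop the scheme and authority so that only the path
remains.  This assumes the job's default filesystem already points at
the same namenode, so a bare path still resolves to the right file.

```java
import java.net.URI;

// Hypothetical demo class; not part of Hadoop.
class StripAuthorityDemo {
    // Keep only the path component, dropping "hdfs://example.com:9000".
    static String pathOnly(String uri) {
        return URI.create(uri).getPath();
    }

    public static void main(String[] args) {
        String a = pathOnly("hdfs://example.com:9000/a.jar");
        String b = pathOnly("hdfs://example.com:9000/b.jar");
        // Joined with ":" (assumed Unix path separator), the entries no
        // longer contain colons of their own, so they split cleanly.
        String joined = a + ":" + b;
        System.out.println(joined);
        System.out.println(joined.split(":").length);
    }
}
```

The stripped paths are what we pass to addFileToClassPath instead of
the full URIs.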

Victor
