hadoop-mapreduce-user mailing list archives

From Victor Hsieh <victorhs...@gmail.com>
Subject Re: Add user jars to mapreduce
Date Wed, 20 Jan 2010 13:36:36 GMT
BTW, this issue has been reported:
http://issues.apache.org/jira/browse/MAPREDUCE-752

On Wed, Jan 20, 2010 at 7:59 PM, Victor Hsieh <victorhsieh@gmail.com> wrote:
> Hi Eric,
>
> (I am new to this mailing list, so I don't have the original email to
> reply to directly.)
>
> I had exactly the same problem today and finally found the cause.
>
> In our case, we added some URIs to the DistributedCache, as you did.
> Unfortunately, the URIs themselves were the problem.  When we added
> several jars by calling addFileToClassPath, the entries were joined
> with colons, the default path separator of the Java classpath on Unix.
> Because each URI also contains colons, the list cannot be split back
> into the original entries, and that is the cause of the failure.
>
> For example, if you add hdfs://example.com:9000/a.jar and
> hdfs://example.com:9000/b.jar to the classpath, your
> mapred.job.classpath.files will look like this (note the colons!):
>
>  hdfs://example.com:9000/a.jar:hdfs://example.com:9000/b.jar
>
> Then, when a worker tries to add them to the classpath (search for
> getFileClassPaths in org.apache.hadoop.mapred.TaskRunner.java), it
> actually adds "hdfs", "//example.com", "9000/a.jar", and so on, which
> is not what we want.
>
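The splitting problem described above can be reproduced without Hadoop. The sketch below is only an illustration in plain Java: the class ClasspathSplitDemo and its methods are hypothetical names, and the split on ":" stands in for what the email says the worker does, not for Hadoop's actual code.

```java
import java.util.Arrays;
import java.util.List;

class ClasspathSplitDemo {
    // Joins entries with ':' the way the email describes the
    // mapred.job.classpath.files value being built.
    static String join(List<String> entries) {
        return String.join(":", entries);
    }

    // Splits the joined value on ':' again, as a Unix-style classpath
    // would be split. Colons inside a URI are treated as separators too.
    static List<String> split(String value) {
        return Arrays.asList(value.split(":"));
    }

    public static void main(String[] args) {
        String joined = join(Arrays.asList(
                "hdfs://example.com:9000/a.jar",
                "hdfs://example.com:9000/b.jar"));
        System.out.println(joined);
        // Prints six fragments instead of the two original URIs.
        for (String piece : split(joined)) {
            System.out.println(piece);
        }
    }
}
```

Two URIs go in and six fragments come out, because the colon after the scheme and the colon before the port are indistinguishable from the separator.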
> Our solution is to remove the "hdfs://example.com:9000" part when
> calling addFileToClassPath.  Hope it helps!
>
> Victor
>
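The workaround described above (dropping the scheme and authority before calling addFileToClassPath) might be sketched as follows. This is a minimal illustration using java.net.URI; the class StripHdfsPrefix and the helper pathOnly are hypothetical names, and it assumes the job's default filesystem is the same HDFS instance, so that a bare path still resolves to the right file.

```java
import java.net.URI;

class StripHdfsPrefix {
    // Returns only the path component of a URI, so that
    // "hdfs://example.com:9000/a.jar" becomes "/a.jar".
    // The result contains no colons from the scheme or the port.
    static String pathOnly(String uri) {
        return URI.create(uri).getPath();
    }

    public static void main(String[] args) {
        String jar = pathOnly("hdfs://example.com:9000/a.jar");
        System.out.println(jar);
        // In a job driver the stripped path would then be handed to Hadoop,
        // for example (not compiled here, needs the Hadoop jars):
        // DistributedCache.addFileToClassPath(new Path(jar), conf);
    }
}
```

With the prefix removed, the joined value would look like /a.jar:/b.jar, which splits cleanly on the colon.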
