hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-964) ClassNotFoundException in ReduceTaskRunner
Date Thu, 01 Feb 2007 20:10:05 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469568 ]

Doug Cutting commented on HADOOP-964:

> Are we going to be able to move all comparator access into the child process?

Yes, I think so.  That's the subject of HADOOP-968.

> should I change this to occur in TaskTracker?

No.  I was referring to the TaskTracker process, which is where ReduceTaskRunner runs.  I
have not yet looked closely at your patch, but it is certainly a candidate for (2), the short-term
fix.  HADOOP-968 is the long-term fix.

> I will create a unit test for the comparator in the jar file now

Great!  That would be most welcome.  There are already some unit tests that use a job jar
file, so this can probably be bundled into one of those.

I think Owen's planning to review your patch more closely.

> ClassNotFoundException in ReduceTaskRunner
> ------------------------------------------
>                 Key: HADOOP-964
>                 URL: https://issues.apache.org/jira/browse/HADOOP-964
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>            Environment: windows xp and fedora core 6 linux, java 1.5.10...should affect all systems
>            Reporter: Dennis Kubes
>            Priority: Blocker
>             Fix For: 0.11.0
>         Attachments: classpath.patch, classpath2.path
> In the ReduceTaskRunner constructor, line 339, a sorter is created that attempts to get
> the map output key and value classes from the configuration object.  This is before the
> TaskTracker$Child process is spawned off into its own separate JVM, so here the classpath
> for the configuration is the classpath that started the TaskTracker.  The current hadoop
> script includes the hadoop jars, meaning that any hadoop writable type will be found, but
> it doesn't include nutch jars, so any nutch writable type or any other writable type will
> not be found and will throw a ClassNotFoundException.
> I don't think it is a good idea to have a dependency on specific Nutch jars in the Hadoop
> script, but it is a good idea to allow jars to be included if they are in specific
> locations, such as HADOOP_HOME, where the nutch jar resides.  I have attached a patch that
> adds any jars in the HADOOP_HOME directory to the hadoop classpath.  This fixes the issues
> with getting ClassNotFoundExceptions inside of Nutch processes.
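The attached patch itself is not included in this message, but based on the description, a minimal sketch of the kind of change it would make to the hadoop script might look like the loop below. The `HADOOP_HOME` default and the jar name are illustrative assumptions, not the actual patch contents:

```shell
#!/bin/sh
# Sketch only (not the actual classpath.patch): append every jar found
# directly under HADOOP_HOME to the classpath, per the issue description.
HADOOP_HOME=${HADOOP_HOME:-/tmp/hadoop-home-demo}   # illustrative default
mkdir -p "$HADOOP_HOME"
touch "$HADOOP_HOME/nutch-0.9.jar"   # stand-in for a user jar placed in HADOOP_HOME

CLASSPATH=${HADOOP_CONF_DIR:-conf}
# Append each jar sitting directly in HADOOP_HOME:
for f in "$HADOOP_HOME"/*.jar; do
  CLASSPATH="$CLASSPATH:$f"
done
echo "$CLASSPATH"
```

With a jar such as the nutch jar dropped into HADOOP_HOME, the TaskTracker JVM started by this script would then have the user's Writable classes on its classpath, which is the short-term fix discussed above.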

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
