hadoop-mapreduce-dev mailing list archives

From "Robert Gonzalez (JIRA)" <j...@apache.org>
Subject [jira] Created: (MAPREDUCE-1907) nutch doesnt run under 0.20.2+228-1~karmic-cdh3b1 version of hadoop
Date Thu, 01 Jul 2010 19:32:50 GMT
nutch doesnt run under 0.20.2+228-1~karmic-cdh3b1 version of hadoop

                 Key: MAPREDUCE-1907
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1907
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: tasktracker
    Affects Versions: 0.20.2
         Environment: Linux 2.6.31-14-server #48-Ubuntu SMP Fri Oct 16 15:07:34 UTC 2009 x86_64 GNU/Linux
            Reporter: Robert Gonzalez

New versions of Hadoop appear to reference the job jar in a different format: instead of
file:/a/b/c/d/job.jar, it is now jar:file:/a/b/c/d/job.jar!, which breaks Nutch when it tries
to load its plugins. Specifically, the stack trace looks like:

Caused by: java.lang.RuntimeException: x point org.apache.nutch.net.URLNormalizer not found.
at org.apache.nutch.net.URLNormalizers.<init>(URLNormalizers.java:124)
at org.apache.nutch.crawl.Injector$InjectMapper.configure(Injector.java:57)

A simple test class was written that used the URLFilters class, and it produced the following stack trace:

10/07/01 14:25:25 INFO mapred.JobClient: Task Id : attempt_201006171624_46525_m_000000_1, Status : FAILED
java.lang.RuntimeException: org.apache.nutch.net.URLFilter not found.
at org.apache.nutch.net.URLFilters.<init>(URLFilters.java:52)
at com.maxpoint.crawl.BidSampler$BIdSMapper.setup(BidSampler.java:42)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)

Running the same test on an older version of Hadoop works.
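
To illustrate the difference between the two location formats described above, here is a minimal sketch of a helper that reduces either form to the same filesystem path. It is not Nutch's or Hadoop's actual code; the class name JarLocation and the method jarPath are hypothetical, and plain string handling is used because the jar: form reported here ends in "!" rather than "!/".

```java
import java.net.URI;

// Hypothetical helper, for illustration only (not part of Nutch or Hadoop).
final class JarLocation {

    // Returns the filesystem path of the jar for either location form:
    //   file:/a/b/c/d/job.jar
    //   jar:file:/a/b/c/d/job.jar!
    static String jarPath(String location) {
        String s = location;
        if (s.startsWith("jar:")) {
            // Drop the "jar:" prefix and everything from the "!" separator on.
            s = s.substring("jar:".length());
            int bang = s.indexOf('!');
            if (bang >= 0) {
                s = s.substring(0, bang);
            }
        }
        if (s.startsWith("file:")) {
            // Let URI parsing strip the scheme and return the path component.
            s = URI.create(s).getPath();
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(jarPath("file:/a/b/c/d/job.jar"));      // /a/b/c/d/job.jar
        System.out.println(jarPath("jar:file:/a/b/c/d/job.jar!")); // /a/b/c/d/job.jar
    }
}
```

Code that assumes only the file: form would see an unexpected prefix and trailing "!" in the newer format, which may be why the plugin lookup fails.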

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
