nutch-dev mailing list archives

From "Armel Nene (JIRA)" <j...@apache.org>
Subject [jira] Commented: (NUTCH-437) MapFile in Hadoop Trunk has changed, must update references
Date Wed, 14 Feb 2007 12:17:05 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12473064 ]

Armel Nene commented on NUTCH-437:
----------------------------------

I was wondering whether this patch could fix my problem, which is very similar to this one,
if not the same. I am using Nutch 0.8.2-dev; I checked it out from SVN a while ago and have
not updated since. Previously I was able to crawl 10000 XML files with no errors whatsoever.
These are the errors I get when fetching:

INFO parser.custom: Custom-parse: Parsing content file:/C:/TeamBinder/AddressBook/9100/(65)E110_ST A0 (1).pdf
07/02/12 22:09:16 INFO fetcher.Fetcher: fetch of file:/C:/TeamBinder/AddressBook/9100/(65)E110_ST A0 (1).pdf failed with: java.lang.NullPointerException
07/02/12 22:09:17 INFO mapred.LocalJobRunner: 0 pages, 0 errors, 0.0 pages/s, 0 kb/s,
07/02/12 22:09:17 FATAL fetcher.Fetcher: java.lang.NullPointerException
07/02/12 22:09:17 FATAL fetcher.Fetcher: at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:198)
07/02/12 22:09:17 FATAL fetcher.Fetcher: at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:189)
07/02/12 22:09:17 FATAL fetcher.Fetcher: at org.apache.hadoop.mapred.MapTask$2.collect(MapTask.java:91)
07/02/12 22:09:17 FATAL fetcher.Fetcher: at org.apache.nutch.fetcher.Fetcher$FetcherThread.output(Fetcher.java:314)
07/02/12 22:09:17 FATAL fetcher.Fetcher: at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:232)
07/02/12 22:09:17 FATAL fetcher.Fetcher: fetcher caught:java.lang.NullPointerException

One problem is that my Hadoop version is reported as hadoop-0.4.0-patched. I don't know
whether that means I am running version 0.4.0, and I find it a little confusing. Once you
clarify that for me, I will be able to apply the patch to my version.

Best Regards,

Armel


> MapFile in Hadoop Trunk has changed, must update references
> -----------------------------------------------------------
>
>                 Key: NUTCH-437
>                 URL: https://issues.apache.org/jira/browse/NUTCH-437
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 0.8.2, 0.9.0
>         Environment: windows xp and java
>            Reporter: Dennis Kubes
>         Assigned To: Andrzej Bialecki 
>             Fix For: 0.8.2, 0.9.0
>
>         Attachments: nutch-hadoop-0.10.2-mapfile.patch
>
>
> The MapFile.Writer signature has changed in Hadoop trunk (version 0.10.x+) to include
> a Configuration object.  Objects in the Nutch codebase that reference MapFile.Writer will
> need to be updated.
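
The kind of call-site update described in the issue can be sketched roughly as follows. This is only an illustration under stated assumptions: the Configuration and Writer classes below are small stand-ins written for this example, not the real Hadoop classes, and the directory name is made up. It shows the general shape of the change (the writer now requires a configuration object, so each caller has to obtain one and pass it through), not the actual contents of the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

// Self-contained sketch: every class here is a hypothetical stand-in, not Hadoop's API.
class MapFileSignatureSketch {

    // Stand-in for a configuration object (NOT org.apache.hadoop.conf.Configuration).
    static final class Configuration {
        final String name;
        Configuration(String name) { this.name = name; }
    }

    // Stand-in writer whose constructor takes the configuration as its first argument.
    static final class Writer {
        final Configuration conf;
        final String dirName;
        final List<String> entries = new ArrayList<>();

        // New-style signature: a configuration is required.
        Writer(Configuration conf, String dirName) {
            if (conf == null) {
                throw new IllegalArgumentException("conf must not be null");
            }
            this.conf = conf;
            this.dirName = dirName;
        }

        // Records a key/value pair in memory; a real writer would persist it.
        void append(String key, String value) {
            entries.add(key + "=" + value);
        }
    }

    // An updated call site: it receives the configuration and hands it to the writer.
    static Writer openWriter(Configuration conf, String dirName) {
        return new Writer(conf, dirName);
    }

    public static void main(String[] args) {
        Configuration conf = new Configuration("job");
        // "crawl/segments/example" is an illustrative path only.
        Writer w = openWriter(conf, "crawl/segments/example");
        w.append("http://example.com/", "fetched");
        System.out.println(w.entries.size() + " entry appended using conf '" + w.conf.name + "'");
    }
}
```

In a real codebase the compiler flags every caller still using the old constructor, which makes such call sites straightforward to find.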

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

