nutch-dev mailing list archives

From "Julien Nioche (JIRA)" <>
Subject [jira] Closed: (NUTCH-696) Timeout for Parser
Date Fri, 28 Aug 2009 13:28:59 GMT


Julien Nioche closed NUTCH-696.

    Resolution: Later

> Timeout for Parser
> ------------------
>                 Key: NUTCH-696
>                 URL:
>             Project: Nutch
>          Issue Type: Wish
>          Components: fetcher
>            Reporter: Julien Nioche
>            Priority: Minor
> I found that parsing sometimes crashes on a specific document, which is a
> shame because it blocks the rest of the segment and Hadoop ends up deciding
> that the node is not responding. I was wondering whether it would make sense
> to have a timeout mechanism for parsing, so that if a document is not parsed
> after a time t, it is simply treated as an exception and we can get on with
> the rest of the process.
> Does that make sense? Where do you think we should implement that, in ParseUtil?
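
The timeout idea described in the issue could be sketched roughly as follows. This is only an illustration using the standard java.util.concurrent classes, not Nutch code: the class TimedParse, the method runWithTimeout and the exception ParseTimeoutException are hypothetical names, and the Callable stands in for whatever call actually parses a document.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Nutch: runs a parse task with a time limit.
public class TimedParse {

    /** Hypothetical exception: the document was not parsed within the limit. */
    public static class ParseTimeoutException extends Exception {
        public ParseTimeoutException(String msg, Throwable cause) {
            super(msg, cause);
        }
    }

    /**
     * Runs parseTask on a separate daemon thread and waits at most
     * timeoutMillis for its result. On timeout the task is cancelled
     * (its thread is interrupted) and ParseTimeoutException is thrown,
     * so the caller can skip the document and continue.
     */
    public static <T> T runWithTimeout(Callable<T> parseTask, long timeoutMillis)
            throws ParseTimeoutException, ExecutionException, InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "parse-worker");
            t.setDaemon(true); // a stuck parser must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(parseTask);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // requests interruption of the worker thread
            throw new ParseTimeoutException(
                "parse exceeded " + timeoutMillis + " ms", e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // A task that finishes in time returns its result normally.
        System.out.println(runWithTimeout(() -> "parsed", 1000));

        // A task that takes too long is reported as a timeout and skipped.
        try {
            runWithTimeout(() -> { Thread.sleep(5000); return "never"; }, 100);
        } catch (ParseTimeoutException e) {
            System.out.println("skipped: " + e.getMessage());
        }
    }
}
```

One caveat with this approach: cancelling the Future only interrupts the worker thread. A parser stuck in a loop that never checks for interruption would keep running in the background, even though the caller has already moved on to the next document.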

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
