nutch-user mailing list archives

From muraliweb <>
Subject Re: Nutch crawl does not capture pages of lower depth
Date Thu, 03 Sep 2009 08:29:07 GMT

I managed to find the problem.
The property indexer.max.tokens in nutch-default.xml was causing the
top-level pages to be skipped.
After raising the value to something like 30000, the crawler was able to
pick up all the pages down to the configured depth.
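
For anyone hitting the same thing, here is a rough sketch of what the
override might look like. It assumes the property is overridden in
conf/nutch-site.xml (the usual place for local settings) rather than edited
in nutch-default.xml directly, and 30000 is simply the value that worked for
my site, not a recommended setting.

```xml
<!-- Sketch only: placing this in conf/nutch-site.xml is an assumption. -->
<!-- It goes inside the file's existing <configuration> element. -->
<property>
  <name>indexer.max.tokens</name>
  <!-- 30000 worked for my pages; larger pages may need a higher limit. -->
  <value>30000</value>
</property>
```

The pages would most likely need to be re-indexed after the change for the
new limit to take effect.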

muraliweb wrote:
> Nutch crawl does not pick up pages at depth 1 and 2 when it's configured
> for depth 3.
> When the crawl is configured at depth 2 it does not pick up the homepage.
> Can anyone please help
> thanks in advance
> murali
