Strange. What happens without -topN?
On Thursday 08 December 2011 03:50:20 Rafael Pappert wrote:
> Hello List,
>
> My CrawlDb contains a few URLs:
>
> nutch readdb crawl/crawldb -stats
> CrawlDb statistics start: crawl/crawldb
> Statistics for CrawlDb: crawl/crawldb
> TOTAL urls: 1832
> retry 0: 1832
> min score: 1.0
> avg score: 1.0
> max score: 1.0
> status 1 (db_unfetched): 1832
> CrawlDb statistics: done
>
> but the generator always returns "0 records selected", even with the
> -noFilter and -noNorm parameters.
>
> nutch generate crawl/crawldb crawl/segments -topN 100 -noNorm -noFilter
> Generator: starting at 2011-12-08 03:37:20
> Generator: Selecting best-scoring urls due for fetch.
> Generator: filtering: false
> Generator: normalizing: false
> Generator: topN: 100
> Generator: 0 records selected for fetching, exiting …
>
> What prevents the generator from selecting URLs for fetching?
>
> Any hints?
>
> Greets,
> Rafael.
--
Markus Jelsma - CTO - Openindex