nutch-dev mailing list archives

From "Gal Nitzan" <gnit...@usa.net>
Subject RE: Generator.java bug?
Date Fri, 02 Feb 2007 14:17:31 GMT
Hi Andrzej,

Well, on my system the list does contain URLs and the fetcher fetches them
correctly. However, if I keep that test in the "if", it reports that the
list is empty.

I am not sure, but maybe the first value is not a FloatWritable, or maybe
it is something else?

Thanks,

Gal



-----Original Message-----
From: Andrzej Bialecki [mailto:ab@getopt.org] 
Sent: Friday, February 02, 2007 3:28 PM
To: nutch-dev@lucene.apache.org
Subject: Re: Generator.java bug?

Gal Nitzan wrote:
> Hi,
>
> After many failures of generate ("Generator: 0 records selected for
> fetching, exiting ...") I made a post about it a few days back.
>
> I narrowed it down to the following function:
>
> public Path generate(Path dbDir, Path segments, int numLists, long topN,
> long curTime, boolean filter, boolean force)
>
> in the following if:
>
> if (readers == null || readers.length == 0 ||
> !readers[0].next(new FloatWritable()))
>
> It turns out that "!readers[0].next(new FloatWritable())" is the
> culprit.

Well, this condition simply checks if the result is not empty. When we 
open Reader[] on a SequenceFile, each reader corresponds to a 
part-xxxxx. There must be at least one part, so we use the one at index 
0. If we cannot retrieve at least one entry from it, then it logically 
follows that the file is empty, and we bail out.
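
To make the check concrete, here is a minimal sketch of the same logic in
plain Java. It does not use the Hadoop classes: PartReader is a
hypothetical stand-in for one part-xxxxx reader, and its next() only
mimics the "true if an entry was read" contract. The second method,
allPartsEmpty, is not in Generator.java; it is an assumption added only to
show the one case the quoted condition does not cover, namely an empty
part at index 0 with entries in other parts. Whether that is what happens
on Gal's system is not established in this thread.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a reader over one part-xxxxx file.
// This is NOT the Hadoop reader class; it only models next().
final class PartReader {
    private final Iterator<Float> keys;

    PartReader(List<Float> keys) {
        this.keys = keys.iterator();
    }

    // Returns true and fills keyHolder[0] if an entry was read,
    // false if the part has no more entries.
    boolean next(float[] keyHolder) {
        if (!keys.hasNext()) {
            return false;
        }
        keyHolder[0] = keys.next();
        return true;
    }
}

final class EmptyCheckSketch {
    // Same shape as the quoted condition: only the part at index 0 is read.
    static boolean firstPartIsEmpty(PartReader[] readers) {
        return readers == null || readers.length == 0
                || !readers[0].next(new float[1]);
    }

    // Assumed alternative (not from Generator.java): the result counts as
    // empty only if no part yields an entry.
    static boolean allPartsEmpty(PartReader[] readers) {
        if (readers == null) {
            return true;
        }
        for (PartReader reader : readers) {
            if (reader.next(new float[1])) {
                return false;
            }
        }
        return true;
    }

    // Builds two parts: part 0 is empty, part 1 holds two entries.
    static PartReader[] emptyFirstPart() {
        return new PartReader[] {
            new PartReader(Collections.<Float>emptyList()),
            new PartReader(Arrays.asList(0.5f, 0.25f))
        };
    }

    public static void main(String[] args) {
        // Fresh readers for each call, because next() consumes an entry.
        System.out.println("first-part check says empty: "
                + firstPartIsEmpty(emptyFirstPart()));
        System.out.println("all-parts check says empty:  "
                + allPartsEmpty(emptyFirstPart()));
    }
}
```

With the two parts built in emptyFirstPart(), the first check reports the
result as empty while the second does not. If every part has at least one
entry, both checks agree.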

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com




