On 6/5/06, Erick Erickson <erickerickson@gmail.com> wrote:
>
> A few thoughts...
>
> 1> are you sure you only indexed the document once? If you indexed the same
> data multiple times, you'll have duplicate documents, each of which will
> have a different Lucene ID (i.e. doc()).
Yes, but I will double-check.
> 2> have you examined your index with, say, Luke? I've found that a wonderful
> tool for seeing if the data I *thought* was in my index was actually
> there.
The database is huge, so I will need some time to go through it and check
whether I made a mistake while creating the indexes.
> 3> when you say "the same document", how do you know that? The internal
> Lucene ID or some field you've put in the index? This is really another
> form of "are you sure you indexed the data once?" because the internal
> Lucene id is what you get back from hits.doc(). If you're getting multiple
> entries like that, then I'm lost.
By "same document" I mean the same document path, i.e. the same URL showing
up multiple times. That's a good point, though: I will check whether they
all have the same doc IDs. Thanks for your suggestion.
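
For the doc ID check, something like the following rough sketch might do.
It assumes the (doc ID, path) pairs have already been read out of the
results, e.g. with hits.id(i) for the internal document number and
hits.doc(i).get("path") for the stored field. The field name "path", the
class name DuplicateCheck and the sample values are only assumptions for
illustration. It is plain Java with no Lucene dependency, so the grouping
logic can be tried on its own.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DuplicateCheck {
    /**
     * Groups doc IDs by path and keeps only the paths that appear under
     * more than one distinct doc ID (i.e. indexed more than once).
     */
    public static Map<String, List<Integer>> findDuplicates(int[] docIds, String[] paths) {
        Map<String, List<Integer>> byPath = new LinkedHashMap<>();
        for (int i = 0; i < docIds.length; i++) {
            List<Integer> ids = byPath.computeIfAbsent(paths[i], k -> new ArrayList<>());
            // The same doc ID seen twice is the same index entry, so record it once.
            if (!ids.contains(docIds[i])) {
                ids.add(docIds[i]);
            }
        }
        // Drop paths that map to a single doc ID; those are not duplicates.
        byPath.values().removeIf(ids -> ids.size() < 2);
        return byPath;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for values read from the hits.
        int[] docIds = {3, 17, 42, 17};
        String[] paths = {"/a.html", "/b.html", "/a.html", "/b.html"};
        System.out.println(findDuplicates(docIds, paths));
    }
}
```

If a path comes back with two or more different doc IDs, the document was
indexed more than once. If the same doc ID repeats for a path, the
duplication would have to be coming from somewhere else.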