lucene-java-user mailing list archives

From "Marcus Falck" <marcus.fa...@observer.se>
Subject Lucene hits.length()
Date Tue, 08 Aug 2006 11:30:40 GMT
I have noticed some strange behavior when searching my Lucene index.

I'm adding 500,000 docs to an index.

 

MergeFactor = 10

MinMerge = 5000

 

When 49,999 docs have been added (just before the first 10 * 5000 merge),
hits.length() reports around 1000 hits for a keyword (which, by the way,
is about the same count as with 5000 docs added). After the 10 * 5000
merge, hits.length() returns around 8000 hits, which seems a lot more
reasonable. Since I'm adding content in date order (oldest first), I
have also tried sorting the hits (newest date first) and displaying the
top 10 hits.

 

According to that output it seems that the documents are added
correctly.
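
To make the "10 * 5000" arithmetic concrete, here is a minimal sketch of a
merge-factor schedule in plain Java. It is a simplified model, not Lucene's
actual IndexWriter code: it assumes documents are buffered until MinMerge
(5000) of them are collected, written out as one segment, and that MergeFactor
(10) equally sized segments are then merged into one. The class and method
names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of a merge-factor segment schedule (not Lucene code). */
public class MergeScheduleSketch {

    /** Sizes of the flushed segments after docsAdded documents (assumed model). */
    static List<Integer> segmentsAfter(int docsAdded, int mergeFactor, int minMergeDocs) {
        if (mergeFactor < 2 || minMergeDocs < 1) {
            throw new IllegalArgumentException("need mergeFactor >= 2 and minMergeDocs >= 1");
        }
        List<Integer> segments = new ArrayList<>();
        int flushes = docsAdded / minMergeDocs; // number of full buffers flushed so far
        for (int i = 0; i < flushes; i++) {
            segments.add(minMergeDocs);
            // Cascade: merge while the newest mergeFactor segments are equally sized.
            while (tailIsUniform(segments, mergeFactor)) {
                int size = segments.get(segments.size() - 1);
                for (int j = 0; j < mergeFactor; j++) {
                    segments.remove(segments.size() - 1);
                }
                segments.add(size * mergeFactor);
            }
        }
        return segments;
    }

    /** Documents still waiting in the buffer, i.e. not yet flushed as a segment. */
    static int buffered(int docsAdded, int minMergeDocs) {
        return docsAdded % minMergeDocs;
    }

    /** True if the last mergeFactor segments all have the same size. */
    static boolean tailIsUniform(List<Integer> segments, int mergeFactor) {
        int n = segments.size();
        if (n < mergeFactor) {
            return false;
        }
        int size = segments.get(n - 1);
        for (int j = n - mergeFactor; j < n; j++) {
            if (segments.get(j) != size) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (int docs : new int[] {49999, 50000}) {
            System.out.println(docs + " docs: segments=" + segmentsAfter(docs, 10, 5000)
                    + ", buffered=" + buffered(docs, 5000));
        }
    }
}
```

Under this model, 49,999 added docs leave nine 5000-doc segments plus 4999
buffered docs, while the 50,000th doc collapses everything into a single
50,000-doc segment. The sketch only shows where documents sit around the
merge boundary; it does not by itself explain the hit counts.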

 

I'm using a MultiSearcher on top of a RAMDirectory and an FSDirectory,
with Lucene 1.4.3.

 

Does anybody have any idea why the hit count is so misleading?

 

/

Regards

Marcus

 

 

