lucene-java-user mailing list archives

From "Marcus Falck" <marcus.fa...@observer.se>
Subject Re: addIndexes method without the merge
Date Wed, 16 Aug 2006 08:57:31 GMT


-----Original message-----
From: Marcus Falck [mailto:marcus.falck@observer.se]
Sent: 16 August 2006 10:47
To: java-user@lucene.apache.org
Subject: addIndexes method without the merge

Hi,

 

In my search engine (built on top of the Lucene 1.4.3 API) I'm using one
RAMDir as a live indexing buffer and one FSDir as the main persisted
index.

 

When the RAMDir buffer has been filled, I add those documents to
the FSDir and clear the RAMDir.

 

At first I iterated through the RAMDir and added every document to the
FSDir, but that turned out to be very inefficient. So I tried the
addIndexes method, but soon realized that addIndexes will always leave
me with one optimized file (as long as maxMergeDocs hasn't been
reached??). Then I tried writing my own addIndexesWithoutMerge method,
whose main purpose is to ALWAYS create a new segment file for the
directory added to it (and still merge at the mergeFactor). At first
this method seemed to work, but after a while I realized that it does
not: it seems to introduce some minor corruption in the index.

 

So I'm wondering if somebody could help me write a new method?

Example 1:
Adding 50000 docs using addIndexes
After 5000:
One 20 MB file.
After 10000:
One 40 MB file.
After 15000:
One 60 MB file.

And so on... (very inefficient).
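
As a rough illustration of why Example 1 scales badly, here is a minimal sketch in plain Java. It assumes that every addIndexes call rewrites the whole index into a single file, and that each 5000-document batch adds about 20 MB (the figures above). The class and method names are hypothetical, and nothing here calls Lucene.

```java
final class FullMergeCost {
    /**
     * Total megabytes written if every flush rewrites the entire index.
     * Assumption: each batch grows the index by batchSizeMb.
     */
    static long totalMegabytesWritten(int batches, long batchSizeMb) {
        long total = 0;
        long indexSizeMb = 0;
        for (int i = 0; i < batches; i++) {
            indexSizeMb += batchSizeMb; // index grows by one batch
            total += indexSizeMb;       // the whole index is written again
        }
        return total;
    }

    public static void main(String[] args) {
        // 10 batches of 5000 docs (about 20 MB each) = 50000 docs
        System.out.println(totalMegabytesWritten(10, 20));
    }
}
```

Under these assumptions, ten batches write 1100 MB in total to end up with a 200 MB index.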

Example 2:
Adding 50000 docs using my addIndexesWithoutMerge
After 5000:
One 20 MB file.
After 10000:
2 x 20 MB files.
After 50000:
1 x 200 MB file.

Example 2 is how I want it. How can I accomplish this?
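
The behaviour wanted in Example 2 can be modelled in plain Java, without Lucene. This is only a sketch of the bookkeeping, not a replacement for addIndexes: it assumes each flushed buffer becomes one new segment, and that whenever mergeFactor equal-sized segments sit at the end of the list they are merged into one. SegmentMergeSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class SegmentMergeSketch {
    /** Appends one flushed buffer as a new segment, then merges equal-sized runs. */
    static void addSegment(List<Integer> segments, int docs, int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be at least 2");
        }
        segments.add(docs);
        while (segments.size() >= mergeFactor) {
            int from = segments.size() - mergeFactor;
            int size = segments.get(from);
            boolean sameSize = true;
            int merged = 0;
            // Inspect the last mergeFactor segments
            for (int i = from; i < segments.size(); i++) {
                sameSize &= segments.get(i) == size;
                merged += segments.get(i);
            }
            if (!sameSize) {
                break; // nothing to merge at this level yet
            }
            // Replace the run with a single merged segment
            segments.subList(from, segments.size()).clear();
            segments.add(merged);
        }
    }

    /** Returns the segment sizes (in docs) after flushing the given number of batches. */
    static List<Integer> simulate(int batches, int docsPerBatch, int mergeFactor) {
        List<Integer> segments = new ArrayList<>();
        for (int i = 0; i < batches; i++) {
            addSegment(segments, docsPerBatch, mergeFactor);
        }
        return segments;
    }

    public static void main(String[] args) {
        System.out.println(simulate(2, 5000, 10));
        System.out.println(simulate(10, 5000, 10));
    }
}
```

With mergeFactor 10 and 5000-document batches, the model gives one segment after 5000 docs, two after 10000, and a single 50000-document segment after the tenth batch, which matches Example 2.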
 
 

Regards,
Marcus

 




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

