lucene-java-user mailing list archives

From saisantoshi <saisantosh...@gmail.com>
Subject Re: IndexWriterConfig.OpenMode.CREATE vs OpenMode.APPEND (index files)
Date Fri, 01 Feb 2013 17:42:38 GMT
>>Are you closing or committing your IndexWriter after each added
document?  Because if you add 100 docs you should not see 100 versions
of these files, only one set of files in the end (many docs are
written to one segment). 

No. What I meant is that if 100 users update the document separately, more
files are produced in the index (with the current version of Lucene, 4.0).
My question is: why does it not overwrite the existing index files when a
document is updated?

I reverted my code to the 2.4 base and updated a document a couple of
times; it produces only a minimal set of files. One thing I should mention
is that I was using the optimize() method there, which might be the reason
the number of files stays small.

Both are using the compound file format:

2.4 (say I update the document 5 times separately)
====
_5.cfs
segments.gen
segments_5

4.0 (the same number of updates to the same document produces more files)
====
Unfortunately, I had to comment out the optimize() call, as this method has
been removed in 4.0. Please let me know if there is an alternative to it.

_0.cfe
_0.cfs
_0.si
_4.cfe
_4.cfs
_4.si
segments.gen
segments_5

Compared to 2.4, it is using far more files. If I add a couple more new
documents, it creates additional files.
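
To make the comparison concrete, the two listings above can be grouped by
segment name. This is only a minimal sketch in plain Java with no Lucene
dependency. The class and method names (SegmentFileGrouper, groupBySegment)
are hypothetical, and it assumes that per-segment files are named
_<segment>.<ext> while the segments* files are index-level metadata.

```java
import java.util.*;

public class SegmentFileGrouper {

    // Groups index file names by their segment prefix (the part before the
    // first dot). Files that do not start with '_' (segments.gen, segments_N)
    // are treated as index-level metadata and skipped.
    static Map<String, List<String>> groupBySegment(List<String> fileNames) {
        Map<String, List<String>> bySegment = new TreeMap<>();
        for (String name : fileNames) {
            if (!name.startsWith("_")) {
                continue;
            }
            int dot = name.indexOf('.');
            String segment = dot < 0 ? name : name.substring(0, dot);
            bySegment.computeIfAbsent(segment, k -> new ArrayList<>()).add(name);
        }
        return bySegment;
    }

    public static void main(String[] args) {
        // File listings copied from this message.
        List<String> v24 = Arrays.asList("_5.cfs", "segments.gen", "segments_5");
        List<String> v40 = Arrays.asList(
                "_0.cfe", "_0.cfs", "_0.si",
                "_4.cfe", "_4.cfs", "_4.si",
                "segments.gen", "segments_5");

        System.out.println("2.4: " + groupBySegment(v24));
        // prints: 2.4: {_5=[_5.cfs]}
        System.out.println("4.0: " + groupBySegment(v40));
        // prints: 4.0: {_0=[_0.cfe, _0.cfs, _0.si], _4=[_4.cfe, _4.cfs, _4.si]}
    }
}
```

Grouped this way, the 4.0 listing is two segments (_0 and _4) with three
files each, while the 2.4 listing is a single segment. That would be
consistent with optimize() having merged everything into one segment in the
2.4 run.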

Maybe it is the optimize() method that is doing the trick in 2.4. How can we
use this method in 4.0?

BTW, I am using the following book, which only covers 3.0. I have read its
section on the Lucene index format, and it seems to be consistent with 2.4
but not with 4.0.



Thanks,
Sai.