lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: IndexDeletionPolicy and optimized indices
Date Wed, 02 Jul 2008 09:29:55 GMT

Well ... that heuristic is not quite general enough, because the
completion of a merge would also decrease the number of files and
increment the generation number by 1 (if a commit had occurred).

You could check for *.cfs and if there is only one, declare the index  
optimized?  This still isn't always correct because if there is a  
_X_N.del file (pending deletes against the segment) then the index is  
not optimized.
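
As a rough illustration only, the file-name check described above could be sketched like this. The class and method names are made up for the example; in a real deletion policy the names would come from IndexCommit.getFileNames(). It assumes the index uses the compound file format.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical helper, not part of Lucene: guesses from file names alone
// whether a commit point looks like an optimized index.
public class OptimizedHeuristic {

    // True if there is exactly one compound segment file (*.cfs)
    // and no deletions file (*.del) among the commit's files.
    public static boolean looksOptimized(Collection<String> fileNames) {
        int cfsCount = 0;
        for (String name : fileNames) {
            if (name.endsWith(".del")) {
                return false; // pending deletes against a segment
            }
            if (name.endsWith(".cfs")) {
                cfsCount++;
            }
        }
        return cfsCount == 1;
    }

    public static void main(String[] args) {
        List<String> beforeOptimize = Arrays.asList("segments_4", "_0.cfs", "_1.cfs", "_2.cfs");
        List<String> afterOptimize = Arrays.asList("segments_5", "_4.cfs");
        List<String> withDeletes = Arrays.asList("segments_6", "_4.cfs", "_4_1.del");
        System.out.println(looksOptimized(beforeOptimize)); // false
        System.out.println(looksOptimized(afterOptimize));  // true
        System.out.println(looksOptimized(withDeletes));    // false
    }
}
```

Note that an index written without compound files has no *.cfs at all, so this sketch would always report false for it.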

But in general, Lucene's file format can change from release to  
release (it's not an API), so if something changes in the future you  
may have to revisit this heuristic.
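
Given that caveat, recording the optimize in the application (as suggested in my earlier reply, quoted below) avoids looking at file names at all. Here is a minimal sketch of that bookkeeping. OptimizeTracker and CommitPoint are hypothetical names; CommitPoint is only a stand-in for the getSegmentsFileName() part of IndexCommit so the example compiles without Lucene. It assumes your own code calls noteOptimizeFinished() right after IW.optimize() returns, and forwards the list from your deletion policy's onCommit().

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical application-side bookkeeping, not a Lucene class.
public class OptimizeTracker {

    // Stand-in for the one IndexCommit method this sketch needs.
    public interface CommitPoint {
        String getSegmentsFileName();
    }

    private final Set<String> optimizedCommits = new HashSet<String>();
    private boolean optimizePending = false;

    // Call right after IW.optimize() returns, before commit() or close().
    public synchronized void noteOptimizeFinished() {
        optimizePending = true;
    }

    // Call from the deletion policy's onCommit(); the last entry is the newest commit.
    public synchronized void onCommit(List<? extends CommitPoint> commits) {
        if (optimizePending && !commits.isEmpty()) {
            CommitPoint newest = commits.get(commits.size() - 1);
            optimizedCommits.add(newest.getSegmentsFileName());
            optimizePending = false;
        }
    }

    // True if this commit was recorded as following an optimize.
    public synchronized boolean wasOptimized(CommitPoint commit) {
        return optimizedCommits.contains(commit.getSegmentsFileName());
    }
}
```

The record here lives only in memory, so it is lost on restart, and it is only accurate if nothing else is added or deleted between the optimize and the commit.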

Mike

Shalin Shekhar Mangar wrote:

> Hi Michael,
>
> Thanks for the response.
>
> Looking at the general way the filenames are organized:
>
> IndexCommit.getFileNames() without optimize (after IW.close())
> [segments_4, _0.cfs, _1.cfs, _2.cfs]
> IndexCommit.getFileNames() after optimize+close
> [segments_5, _4.cfs]
>
> We can compare the latest commit point's files with the previous
> commit point's files. If the number of .cfs files has decreased or
> stayed the same, and the generation number has increased by 1, can we
> reliably say that an optimize has happened?
>
> On Tue, Jul 1, 2008 at 5:44 PM, Michael McCandless
> <lucene@mikemccandless.com> wrote:
>>
>> You're right IndexCommit doesn't know that it represents an  
>> optimized index.
>>
>> Likewise, IndexCommit doesn't know other "semantic" things about  
>> the index, eg, you've just called expungeDeletes, or, you just  
>> finished adding batch X of documents to the index, etc.
>>
>> Also, realize that with autoCommit=false (to be the only choice in  
>> 3.0), no commit will be done after an optimize.  Ie you have to  
>> call commit() or close() explicitly to make it a commit.
>>
>> I think the simplest general approach to "know" which commit points  
>> represent "interesting" times to the application would be to call  
>> IW.optimize() then IW.commit() (if you are using trunk) or just  
>> IW.close(), then look at the last IndexCommit passed to your  
>> deletion policy's onCommit() and record yourself that this commit  
>> was the result of an optimize.
>>
>> Mike
>>
>> Shalin Shekhar Mangar wrote:
>>
>>> Hi,
>>>
>>> I'm implementing a custom IndexDeletionPolicy. An IndexCommit object
>>> does not carry any information about whether its index is optimized
>>> or not.
>>> How can an IndexDeletionPolicy know which IndexCommit instances
>>> correspond to optimized indices?
>>>
>>> --
>>> Regards,
>>> Shalin Shekhar Mangar.
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>
>>
>
>
>
> --
> Regards,
> Shalin Shekhar Mangar.



