lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael McCandless <luc...@mikemccandless.com>
Subject Re: IndexDeletionPolicy and optimized indices
Date Wed, 02 Jul 2008 17:38:28 GMT

OK I opened this one:

     https://issues.apache.org/jira/browse/LUCENE-1325

Mike

Shalin Shekhar Mangar wrote:

> That's great. Thanks!
>
> On Wed, Jul 2, 2008 at 6:04 PM, Michael McCandless
> <lucene@mikemccandless.com> wrote:
>>
>> OK I think that makes sense.  I'll take it.  I'll add an  
>> isOptimized() to
>> IndexCommit.
>>
>> Mike
>>
>> Shalin Shekhar Mangar wrote:
>>
>>> Ok, so there is no reliable way which can work across releases.
>>> Actually, we are implementing replication feature for Solr  
>>> (SOLR-561)
>>> and we'd like the user to configure a replication/snapshoot on  
>>> commit
>>> or only on optimize. We want to rely on IndexDeletionPolicy to avoid
>>> copying index as snapshot and replicate directly from main index.
>>> Therefore, we want a long term and reliable solution.
>>>
>>> Can we think of having an API to support this?
>>>
>>> On Wed, Jul 2, 2008 at 2:59 PM, Michael McCandless
>>> <lucene@mikemccandless.com> wrote:
>>>>
>>>> Well ... that heuristic is not quite general enough, because the
>>>> completion
>>>> of a merge would also decrease the # files and +1 the generation  
>>>> number
>>>> (if
>>>> a commit had occurred).
>>>>
>>>> You could check for *.cfs and if there is only one, declare the  
>>>> index
>>>> optimized?  This still isn't always correct because if there is a
>>>> _X_N.del
>>>> file (pending deletes against the segment) then the index is not
>>>> optimized.
>>>>
>>>> But in general, Lucene's file format can change from release to  
>>>> release
>>>> (it's not an API), so if something changes in the future you may  
>>>> have to
>>>> revisit this heuristic.
>>>>
>>>> Mike
>>>>
>>>> Shalin Shekhar Mangar wrote:
>>>>
>>>>> Hi Michael,
>>>>>
>>>>> Thanks for the response.
>>>>>
>>>>> Looking at the general way the filenames are organized:
>>>>>
>>>>> IndexCommit.getFileNames() without optimize (after IW.close())
>>>>> [segments_4, _0.cfs, _1.cfs, _2.cfs]
>>>>> IndexCommit.getFileNames() after optimize+close [segments_5,  
>>>>> _4.cfs]
>>>>>
>>>>> We can compare the latest commit point's files with the previous
>>>>> commit point's files and if the number of .cfs files have  
>>>>> decreased
>>>>> (or equal) (with a +1 in generation number), can we reliably say  
>>>>> if an
>>>>> optimize has happened?
>>>>>
>>>>> On Tue, Jul 1, 2008 at 5:44 PM, Michael McCandless
>>>>> <lucene@mikemccandless.com> wrote:
>>>>>>
>>>>>> You're right IndexCommit doesn't know that it represents an  
>>>>>> optimized
>>>>>> index.
>>>>>>
>>>>>> Likewise, IndexCommit doesn't know other "semantic" things  
>>>>>> about the
>>>>>> index, eg, you've just called expungeDeletes, or, you just  
>>>>>> finished
>>>>>> adding
>>>>>> batch X of documents to the index, etc.
>>>>>>
>>>>>> Also, realize that with autoCommit=false (to be the only choice 

>>>>>> in
>>>>>> 3.0),
>>>>>> no commit will be done after an optimize.  Ie you have to call  
>>>>>> commit()
>>>>>> or
>>>>>> close() explicitly to make it a commit.
>>>>>>
>>>>>> I think the simplest general approach to "know" which commit  
>>>>>> points
>>>>>> represent "interesting" times to the application would be to call
>>>>>> IW.optimize() then IW.commit() (if you are using trunk) or just
>>>>>> IW.close(),
>>>>>> then look at the last IndexCommit passed to your deletion  
>>>>>> policy's
>>>>>> onCommit() and record yourself that this commit was the result  
>>>>>> of an
>>>>>> optimize.
>>>>>>
>>>>>> Mike
>>>>>>
>>>>>> Shalin Shekhar Mangar wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'm implementing a custom IndexDeletionPolicy. An IndexCommit
 
>>>>>>> object
>>>>>>> does not have any information whether it's index is optimized
 
>>>>>>> or not.
>>>>>>> How can a IndexDeletionPolicy know which IndexCommit instances
>>>>>>> corresponded to optimized indices?
>>>>>>>
>>>>>>> --
>>>>>>> Regards,
>>>>>>> Shalin Shekhar Mangar.
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>>> For additional commands, e-mail: java-user- 
>>>>>>> help@lucene.apache.org
>>>>>>>
>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Regards,
>>>>> Shalin Shekhar Mangar.
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Shalin Shekhar Mangar.
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
>
>
> -- 
> Regards,
> Shalin Shekhar Mangar.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message