lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: IndexCommit.delete() outside of IndexDeletionPolicy
Date Wed, 06 Jun 2012 11:37:21 GMT
I think this use case makes sense; such logic (a distributed,
ref-counted deletion policy) would make a nice contribution. It's the
"proper" way to delete commits when multiple nodes are in use (vs.,
e.g., using a timeout-based deletion policy).

You can actually do it today: call IndexWriter.deleteUnusedFiles().
That visits the deletion policy, which gives you a chance to delete
commit points. It does mean you have to set a real deletion policy on
the writer (rather than NoDeletionPolicy), and that policy in turn
goes and checks the reference counts across all nodes.
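
To make the idea concrete, here is a minimal sketch of the selection
step such a policy might run from onInit()/onCommit(). It is plain Java
with no Lucene dependency, and the names are hypothetical: commit
generations are modelled as long values, and the refCounts map stands in
for whatever would actually be read from the coordination service. In a
real policy, the selected generations would correspond to the commits on
which IndexCommit.delete() is called.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: decides which commits a
// ref-counted deletion policy could delete.
class RefCountedCommitSelector {

    /**
     * Returns the generations that are safe to delete: every commit except
     * the latest one whose reference count is zero or missing from the map.
     *
     * @param generations commit generations, oldest first
     * @param refCounts   assumed snapshot of how many sessions hold each generation
     */
    static List<Long> selectDeletable(List<Long> generations,
                                      Map<Long, Integer> refCounts) {
        List<Long> deletable = new ArrayList<>();
        // Stop before the last element: the latest commit is always kept.
        for (int i = 0; i < generations.size() - 1; i++) {
            long gen = generations.get(i);
            int refs = refCounts.getOrDefault(gen, 0);
            if (refs <= 0) {
                deletable.add(gen);
            }
        }
        return deletable;
    }

    public static void main(String[] args) {
        List<Long> generations = Arrays.asList(3L, 4L, 5L, 6L);
        Map<Long, Integer> refCounts = new HashMap<>();
        refCounts.put(4L, 2); // two sessions still reading generation 4
        // Generations 3 and 5 are unreferenced; 6 is the latest and is kept.
        System.out.println(selectDeletable(generations, refCounts)); // prints [3, 5]
    }
}
```

With this shape, the process that reacts to a session ending would only
need to call IndexWriter.deleteUnusedFiles() so that the policy is
visited and re-evaluates the counts.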

Mike McCandless

http://blog.mikemccandless.com


On Wed, Jun 6, 2012 at 7:16 AM, Colin Goodheart-Smithe
<colings86.dev@googlemail.com> wrote:
> I was looking at the Lucene API for IndexCommit and noticed that the
> JavaDoc states that
>
> *'Decision that a commit-point should be deleted is taken by the
> IndexDeletionPolicy<http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/index/IndexDeletionPolicy.html>
> in
> effect and therefore this should only be called by its
> onInit()<http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/index/IndexDeletionPolicy.html#onInit(java.util.List)>
>  or onCommit()<http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/index/IndexDeletionPolicy.html#onCommit(java.util.List)>
>  methods.'*
> (http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/index/IndexCommit.html#delete())
>
> I was wondering why this is the case and whether deleting IndexCommits
> outside of an IndexDeletionPolicy is actually a bad idea?
>
> To put some context around this, I am looking to implement a deletion policy
> which is independent of the IndexWriter commit and more dependent on the
> processes using a particular commit point being finished with it.
> The logic would look something like the following, and state would be stored
> in something like ZooKeeper so I can make use of ephemeral nodes and watcher
> events:
>
>   - IndexWriters would have a NoDeletionPolicy set
>   - Each time a process opens a session it registers an ephemeral node
>   - The session is assigned the current (latest) commit point
>   - Each time a process removes the node (either through crashing or
>   having finished the job) a watch event is fired where a separate process
>   will delete the commit point the process was using if no other processes
>   are using the commit point and if it is not the latest commit point
>
> Processes may have fairly long running sessions so across all the processes
> a reasonable number of commit points might be in use.  I don't really want
> to have to wait for a commit from the IndexWriter (which may not happen for
> a while) to clear up the older commit points I no longer need.  Would this
> logic pose any issues given that it is going to be deleting Commit points
> outside of the IndexDeletionPolicy?

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

