lucene-java-user mailing list archives

From Colin Goodheart-Smithe <colings86....@googlemail.com>
Subject IndexCommit.delete() outside of IndexDeletionPolicy
Date Wed, 06 Jun 2012 11:16:11 GMT
I was looking at the Lucene API for IndexCommit and noticed that the
JavaDoc states that

*'Decision that a commit-point should be deleted is taken by the
IndexDeletionPolicy<http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/index/IndexDeletionPolicy.html>
in effect and therefore this should only be called by its
onInit()<http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/index/IndexDeletionPolicy.html#onInit(java.util.List)>
or onCommit()<http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/index/IndexDeletionPolicy.html#onCommit(java.util.List)>
methods.'*
(http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/index/IndexCommit.html#delete())

I was wondering why this is the case, and whether deleting IndexCommits
outside of an IndexDeletionPolicy is actually a bad idea.

To put some context around this, I am looking to implement a deletion policy
that is independent of IndexWriter commits and instead depends on the
processes using a particular commit point having finished with it.
The logic would look something like the following, with state stored in
something like ZooKeeper so that I can make use of ephemeral nodes and watch
events:

   - IndexWriters would have a NoDeletionPolicy set
   - Each time a process opens a session it registers an ephemeral node
   - The session is assigned the current (latest) commit point
   - Each time a process's node is removed (either because the process
   crashed or because it finished its job), a watch event fires and a
   separate process deletes the commit point that process was using,
   provided no other process is using that commit point and it is not the
   latest commit point
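
The bookkeeping in the list above could be sketched roughly as follows. This
is only an in-memory illustration of the reference-counting rule: the class
name CommitRefTracker and its methods are hypothetical, commit points are
identified by a plain generation number, and neither Lucene nor ZooKeeper is
actually called. In the real design the maps would live in ZooKeeper and a
returned generation would be the one to look up and call delete() on.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of Lucene or ZooKeeper.
// Tracks which commit generation each session is pinned to and reports
// when a generation becomes safe to delete under the rules listed above.
final class CommitRefTracker {
    private final Map<String, Long> sessionToGeneration = new HashMap<>();
    private final Map<Long, Integer> refCounts = new HashMap<>();
    private long latestGeneration = -1;

    // Record that the writer has produced a new commit point.
    synchronized void onCommit(long generation) {
        latestGeneration = Math.max(latestGeneration, generation);
    }

    // Pin a new session to the current (latest) commit point.
    synchronized long openSession(String sessionId) {
        if (latestGeneration < 0) {
            throw new IllegalStateException("no commit point recorded yet");
        }
        if (sessionToGeneration.containsKey(sessionId)) {
            throw new IllegalArgumentException("session already open: " + sessionId);
        }
        sessionToGeneration.put(sessionId, latestGeneration);
        refCounts.merge(latestGeneration, 1, Integer::sum);
        return latestGeneration;
    }

    // Called when a session's node disappears (finished or crashed).
    // Returns the generation that may now be deleted, if any.
    synchronized Optional<Long> closeSession(String sessionId) {
        Long generation = sessionToGeneration.remove(sessionId);
        if (generation == null) {
            return Optional.empty(); // unknown or already closed session
        }
        int remaining = refCounts.merge(generation, -1, Integer::sum);
        if (remaining > 0) {
            return Optional.empty(); // other sessions still use this commit
        }
        refCounts.remove(generation);
        if (generation == latestGeneration) {
            return Optional.empty(); // never delete the latest commit point
        }
        return Optional.of(generation);
    }
}
```

One thing the sketch makes visible: a commit point that is still the latest
when its last session closes is never reported as deletable, so it would
need to be revisited once a newer commit exists.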

Processes may have fairly long-running sessions, so across all the processes
a reasonable number of commit points might be in use. I don't really want
to have to wait for a commit from the IndexWriter (which may not happen for
a while) to clear up the older commit points I no longer need. Would this
logic pose any issues, given that it is going to be deleting commit points
outside of the IndexDeletionPolicy?
