Jason Rutherglen wrote:
> For Ocean I created a workaround where the IndexCommits from
> IndexDeletionPolicy are saved in a map in order to achieve deleting
> based on the IndexReader. It would be more straightforward to
> delete from the IndexCommit in IndexReader.
It seems like we are mixing up deleting a whole commit point with
deleting individual documents? Or does Ocean somehow decide to delete
a whole commit point based on which documents have been deleted?
> I realize people want to get away from IndexReader performing
> updates, however, for my use case, realtime search updating from
> IndexReader makes sense mainly for obtaining the doc ids of
> deletions. With IndexWriter managing the merges it would seem
> difficult to expose doc numbers, but perhaps there is a way.
IndexWriter can now delete by query, but it sounds like that's not
sufficient for Ocean?
Under the hood, IndexWriter has the infrastructure to hold pending
deleted docIDs and update these docIDs when a merge is committed. I.e.,
previously we forced a flush of all pending deletes on every
flush/merge, but now we buffer the docIDs across flushes/merges. This
means IndexWriter *could* delete by docID; however, none of this is
exposed publicly.
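
To make the docID remapping concrete, here is a rough sketch of the
idea, not IndexWriter's actual code. It assumes a toy model in which a
merge drops documents that were already marked deleted and shifts the
surviving docIDs down. The class and method names (PendingDeleteRemapper,
buildDocMap, remap) are hypothetical and do not exist in Lucene.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names are Lucene APIs.
public class PendingDeleteRemapper {

    // Build an old-docID -> new-docID map for a merge that compacts away
    // documents already marked deleted. Dropped docs map to -1.
    static int[] buildDocMap(boolean[] deletedBeforeMerge) {
        int[] docMap = new int[deletedBeforeMerge.length];
        int next = 0;
        for (int i = 0; i < docMap.length; i++) {
            docMap[i] = deletedBeforeMerge[i] ? -1 : next++;
        }
        return docMap;
    }

    // Translate buffered (not yet flushed) deleted docIDs into the
    // post-merge docID space. IDs whose docs were dropped by the merge
    // are discarded, since there is nothing left to delete.
    static List<Integer> remap(List<Integer> pendingDeletes, int[] docMap) {
        List<Integer> remapped = new ArrayList<>();
        for (int oldId : pendingDeletes) {
            int newId = docMap[oldId];
            if (newId != -1) {
                remapped.add(newId);
            }
        }
        return remapped;
    }

    public static void main(String[] args) {
        // Docs 1 and 4 were deleted before the merge started.
        boolean[] deleted = {false, true, false, false, true, false};
        int[] docMap = buildDocMap(deleted);

        // Deletes of docs 2 and 5 were buffered while the merge ran.
        List<Integer> pending = Arrays.asList(2, 5);
        System.out.println(remap(pending, docMap)); // prints [1, 3]
    }
}
```

The point of the sketch is only that buffered docIDs stay valid across a
merge as long as they are translated when the merge commits.
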
Also, this doesn't solve the problem of how you would get the docIDs
to delete in the first place (i.e., one must still use a separate
IndexReader for that).
I'm not sure this helps you (Ocean) since you presumably need to flush
deletes very quickly to have realtime search...
Mike