lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Updated: (LUCENE-325) [PATCH] new method expungeDeleted() added to IndexWriter
Date Sat, 09 Feb 2008 15:22:07 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-325:
--------------------------------------

    Attachment: LUCENE-325.patch

Attached patch.  All tests pass.  I plan to commit in a day or two.

This adds two methods to IndexWriter:

  expungeDeletes() -- defaults to doWait=true
  expungeDeletes(boolean doWait)

If doWait is false and you have a MergeScheduler that runs merges in
background threads, then the call returns immediately without waiting
for those merges to finish.
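
To make the doWait behavior concrete, here is a small self-contained toy model. It is not the patch and does not use Lucene: DoWaitSketch, mergeWork and isMergeFinished() are hypothetical names, and the toy always runs its "merge" on a background thread, which a real IndexWriter only does when its MergeScheduler works that way.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Toy stand-in for a writer whose merges run on a background thread.
// Hypothetical class: it only illustrates the doWait contract.
final class DoWaitSketch {
    private final ExecutorService backgroundMerges = Executors.newSingleThreadExecutor();
    private final Runnable mergeWork;            // stands in for the actual merge
    private volatile boolean mergeFinished = false;

    DoWaitSketch(Runnable mergeWork) {
        this.mergeWork = mergeWork;
    }

    // No-arg form defaults to doWait=true, as described in the message.
    void expungeDeletes() throws InterruptedException, ExecutionException {
        expungeDeletes(true);
    }

    void expungeDeletes(boolean doWait) throws InterruptedException, ExecutionException {
        Future<?> pending = backgroundMerges.submit(() -> {
            mergeWork.run();
            mergeFinished = true;
        });
        if (doWait) {
            pending.get();   // block until the background merge completes
        }
        // doWait == false: return right away, the merge keeps running
    }

    boolean isMergeFinished() {
        return mergeFinished;
    }

    // Stop accepting work and wait (bounded) for pending merges.
    void close() throws InterruptedException {
        backgroundMerges.shutdown();
        backgroundMerges.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

With doWait=false the caller gets control back while the merge is still pending, so anything that depends on the deletes being gone has to wait for the merges some other way (in this toy, by calling close()).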

I extended MergePolicy so it decides what "expunge deletes" really
means (findMergesToExpungeDeletes).  Then, in LogMergePolicy, I
implemented this policy: we merge all adjacent segments (up to
mergeFactor at once) that have deletes.  If only one segment has
deletes, it is merged on its own (a singular merge).
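
As an illustration of that selection rule, here is a minimal self-contained sketch. It is not the code in LUCENE-325.patch: SegmentSketch and ExpungeDeletesSketch are hypothetical toy types, a segment is reduced to a name plus a deleted-doc count, and the method only borrows the findMergesToExpungeDeletes name.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a segment: just a name and how many docs are deleted in it.
final class SegmentSketch {
    final String name;
    final int delCount;

    SegmentSketch(String name, int delCount) {
        this.name = name;
        this.delCount = delCount;
    }

    boolean hasDeletions() {
        return delCount > 0;
    }
}

final class ExpungeDeletesSketch {
    // Groups runs of adjacent segments that have deletes, at most
    // mergeFactor segments per group. A lone segment with deletes
    // becomes a one-segment (singular) merge.
    static List<List<SegmentSketch>> findMergesToExpungeDeletes(
            List<SegmentSketch> segments, int mergeFactor) {
        if (mergeFactor < 1) {
            throw new IllegalArgumentException("mergeFactor must be >= 1");
        }
        List<List<SegmentSketch>> merges = new ArrayList<>();
        List<SegmentSketch> run = new ArrayList<>();
        for (SegmentSketch seg : segments) {
            if (seg.hasDeletions()) {
                run.add(seg);
                if (run.size() == mergeFactor) {   // group is full, start a new one
                    merges.add(run);
                    run = new ArrayList<>();
                }
            } else if (!run.isEmpty()) {           // a clean segment ends the run
                merges.add(run);
                run = new ArrayList<>();
            }
        }
        if (!run.isEmpty()) {
            merges.add(run);                       // trailing run at the end of the index
        }
        return merges;
    }
}
```

For example, with segments a, b, c, d, e where only b, c and e have deletes and mergeFactor is 10, the sketch proposes two merges: {b, c} and the singular merge {e}. Segments without deletes are left untouched.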


> [PATCH] new method expungeDeleted() added to IndexWriter
> --------------------------------------------------------
>
>                 Key: LUCENE-325
>                 URL: https://issues.apache.org/jira/browse/LUCENE-325
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: CVS Nightly - Specify date in submission
>         Environment: Operating System: Windows XP
> Platform: All
>            Reporter: John Wang
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.4
>
>         Attachments: attachment.txt, IndexWriter.patch, IndexWriter.patch, LUCENE-325.patch, TestExpungeDeleted.java
>
>
> We make use of the docIDs in Lucene. I need a way to compact the docIDs in segments
> to remove the "holes" created by deletes. The only way to do this is by
> calling IndexWriter.optimize(). This is a very heavy call; for cases where
> the index is large but has very few deleted docs, calling optimize
> is not practical.
> I need a new method: expungeDeleted(), which finds all the segments that have
> deleted documents and merges only those segments.
> I have implemented this method and have discussed submitting a patch with Otis.
> I don't see where I can attach the patch. I will follow the
> patch guideline and email the Lucene mailing list.
> Thanks
> -John
> I don't see a place where I can

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



