lucene-java-user mailing list archives

From: Michael McCandless
Subject: Re: Searching while optimizing
Date: Tue, 24 Nov 2009 20:39:41 GMT
OK, I'll add that to the javadocs; thanks.

But the fact that you weren't closing the old readers was probably
also tying up lots of disk space...


On Tue, Nov 24, 2009 at 3:31 PM, vsevel wrote:
> Hi, this is good information. As I read your post, I realized that I am
> supposed to commit after an optimize, which is something I do not currently
> do. That would probably explain the extra disk space I saw being consumed.
> If this is correct, then the optimize javadoc could be improved to say that
> it needs to be followed by a commit or close, like any other write.
> Thanks for the help,
> vincent
> Michael McCandless-2 wrote:
>> On Tue, Nov 24, 2009 at 9:08 AM, vsevel wrote:
>>> Hi, just to make sure I understand correctly... After an optimize,
>>> without any reader, my index takes 30 GB on disk. Are you saying that
>>> if I can ensure there is only one reader at a time, it could take up
>>> to 120 GB on disk if searching while an optimize is going on?
>>> I did not get your 3X when there is no reader. In that situation,
>>> isn't that the nominal size?
>> If before optimizing your index takes 30 GB, then you open a writer,
>> and start the optimize and wait for it to finish, it can take up to 90
>> GB.  Once the optimize is done, but before you commit, 60 GB will be
>> in use.  Once you commit/close this will drop to 30 GB.
>> (These are all worst-case numbers -- in practice, an optimized index
>> is smaller than the original, sometimes by a lot, e.g. if there are
>> many pending deletions.)
>> If the reader was already opened before you opened the writer, then
>> there's no change to disk space requirements (because the reader has
>> opened a commit (the starting commit) that the writer will not delete,
>> anyway).
>> But if you open a new reader while the optimize is underway, it's
>> possible to require a total of 120 GB of space (30 GB for your index, 90 GB
>> transient), because the reader is holding open temporary segments that
>> the writer wants to delete.  If you open more than one reader, and
>> don't close the old ones, you can tie up even more disk space.
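
To make the worst-case arithmetic above concrete, here is a small Java sketch that applies the same multipliers to any starting index size: 3X while the optimize runs, 2X once it finishes but before commit, 1X after commit/close, and 4X if a new reader is opened while the optimize is underway. The class and method names are made up for illustration and are not part of Lucene; as noted above, real usage is usually lower than these bounds.

```java
// Worst-case disk usage around an optimize, using the multipliers quoted
// in this thread. All names here are hypothetical, not Lucene API.
final class OptimizeDiskEstimate {

    // While the optimize is running: up to 3X the starting index size.
    static long peakDuringOptimize(long indexGb) {
        return 3 * indexGb;
    }

    // Optimize finished, but no commit/close yet: up to 2X.
    static long afterOptimizeBeforeCommit(long indexGb) {
        return 2 * indexGb;
    }

    // After commit/close: back to 1X (worst case; often smaller in practice).
    static long afterCommit(long indexGb) {
        return indexGb;
    }

    // A new reader opened mid-optimize can pin temporary segments:
    // the index itself plus up to 3X transient space, i.e. 4X in total.
    static long peakWithReaderOpenedMidOptimize(long indexGb) {
        return indexGb + 3 * indexGb;
    }

    public static void main(String[] args) {
        long indexGb = 30; // the index size discussed in this thread
        System.out.println("during optimize:        " + peakDuringOptimize(indexGb) + " GB");
        System.out.println("before commit:          " + afterOptimizeBeforeCommit(indexGb) + " GB");
        System.out.println("after commit/close:     " + afterCommit(indexGb) + " GB");
        System.out.println("reader opened midway:   " + peakWithReaderOpenedMidOptimize(indexGb) + " GB");
    }
}
```

For the 30 GB index in question this gives 90, 60, 30 and 120 GB respectively, matching the figures in the message.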
>>> Different subject: I saw in 3.0.0RC1 that interrupting a merging
>>> thread was being discussed. Couldn't you do something similar for
>>> searches? I let my users do full-text searches on documents with over
>>> 50 fields. If they use too many wildcards, the search can take a long
>>> time, and rather than restricting what they can do, I would rather let
>>> them cancel the search gracefully. Would that be feasible?
>> IndexWriter is interruptible via Thread.interrupt(), but searching
>> currently is not.  However, TimeLimitingCollector can be used to set a
>> timeout for searches.
>> Mike
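
The timeout idea mentioned above can be illustrated without Lucene on the classpath. The sketch below is not TimeLimitingCollector itself, only a plain-Java stand-in for the general pattern: wrap the per-hit callback, check a clock on every hit, and throw once the time budget is spent, keeping whatever was collected so far. TimeLimitSketch, TimeExceeded and timeLimited are hypothetical names, and the injectable clock is an assumption made so the behaviour is easy to check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;
import java.util.function.LongSupplier;

// Plain-Java illustration of a time-limited hit collector.
// Hypothetical names throughout; this is NOT Lucene's TimeLimitingCollector.
final class TimeLimitSketch {

    // Thrown when a hit arrives after the time budget has been spent.
    static final class TimeExceeded extends RuntimeException {
        TimeExceeded(String message) {
            super(message);
        }
    }

    // Wraps a per-hit callback so that it aborts once the budget is exceeded.
    // The clock is passed in so that callers can substitute their own.
    static IntConsumer timeLimited(IntConsumer delegate, long budgetMillis,
                                   LongSupplier clockMillis) {
        final long deadline = clockMillis.getAsLong() + budgetMillis;
        return docId -> {
            if (clockMillis.getAsLong() > deadline) {
                throw new TimeExceeded("search exceeded " + budgetMillis + " ms");
            }
            delegate.accept(docId);
        };
    }

    public static void main(String[] args) {
        List<Integer> hits = new ArrayList<>();
        IntConsumer collector =
                timeLimited(hits::add, 1000L, System::currentTimeMillis);
        try {
            // Stand-in for a search loop that reports matching doc ids.
            for (int docId = 0; docId < 1_000_000; docId++) {
                collector.accept(docId);
            }
        } catch (TimeExceeded e) {
            System.out.println("Stopped early: " + e.getMessage());
        }
        // Hits gathered before the timeout are still available.
        System.out.println("Collected " + hits.size() + " hits");
    }
}
```

Note that this only bounds the search time; it does not give users a cancel button. The check happens per hit, so a query that spends a long time before producing its first hit would not be cut short by this pattern alone.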
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> For additional commands, e-mail: