lucene-java-user mailing list archives

From vsevel <v.se...@lombardodier.com>
Subject Re: Searching while optimizing
Date Tue, 24 Nov 2009 20:31:27 GMT

Hi, this is good information. As I read your post, I realized that I am
supposed to commit after an optimize, which I do not currently do. That
probably explains the extra disk space I saw being consumed. If this is
correct, the optimize javadoc could be improved to say that optimize needs
to be followed by a commit or close, like any other write.
Thanks for the help,
vincent


Michael McCandless-2 wrote:
> 
> On Tue, Nov 24, 2009 at 9:08 AM, vsevel <v.sevel@lombardodier.com> wrote:
>> Hi, just to make sure I understand correctly... After an optimize, without
>> any reader, my index takes 30 GB on the disk. Are you saying that if I can
>> ensure there is only one reader at a time, it could take up to 120 GB on the
>> disk if searching while an optimize is going on?
>>
>> I did not get your 3X when there is no reader. In that situation isn't that
>> the nominal size?
> 
> If before optimizing your index takes 30 GB, then you open a writer,
> and start the optimize and wait for it to finish, it can take up to 90
> GB.  Once the optimize is done, but before you commit, 60 GB will be
> in use.  Once you commit/close this will drop to 30 GB.
> 
> (These are all worst-case numbers -- in practice, an optimized index
> is smaller than the original, sometimes by a lot, e.g. if there are many
> pending deletions.)
> 
> If the reader was already opened before you opened the writer, then
> there's no change to disk space requirements (because the reader has
> opened a commit (the starting commit) that the writer will not delete,
> anyway).
> 
> But if you open a new reader while the optimize is underway, it's
> possible to require a total of 120 GB of space (30 GB for your index, 90 GB
> transient), because the reader is holding open temporary segments that
> the writer wants to delete.  If you open more than one reader, and
> don't close the old ones, you can tie up even more disk space.
> 
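
For reference, the worst-case figures above work out to simple multiples of the starting index size: up to 3X while the optimize runs, 2X once it has finished but before the commit, 1X after commit/close, and up to 4X if a new reader is opened while the optimize is underway. A minimal sketch of that arithmetic in plain Java follows. It makes no Lucene calls, the class and method names are hypothetical, and the multipliers are only the worst-case numbers quoted in this thread, not measurements.

```java
/**
 * Worst-case disk usage around an optimize, expressed as multiples of the
 * starting index size. Hypothetical helper; the multipliers come from the
 * numbers quoted in this thread and are upper bounds, not measurements.
 */
final class OptimizeDiskEstimate {
    private OptimizeDiskEstimate() {}

    /** While the optimize runs, with no reader opened mid-optimize: up to 3X. */
    static long peakDuringOptimizeGb(long indexGb) {
        return 3 * indexGb;
    }

    /** Optimize finished, but the writer has not committed or closed yet: up to 2X. */
    static long afterOptimizeBeforeCommitGb(long indexGb) {
        return 2 * indexGb;
    }

    /** After commit/close the old segments can be deleted: back to 1X at worst. */
    static long afterCommitGb(long indexGb) {
        return indexGb;
    }

    /** A new reader opened while the optimize is underway: up to 4X in total. */
    static long peakWithReaderOpenedMidOptimizeGb(long indexGb) {
        return 4 * indexGb;
    }

    public static void main(String[] args) {
        long indexGb = 30; // starting index size used in this thread
        System.out.println("during optimize:         " + peakDuringOptimizeGb(indexGb) + " GB");
        System.out.println("after optimize, no commit: " + afterOptimizeBeforeCommitGb(indexGb) + " GB");
        System.out.println("after commit/close:      " + afterCommitGb(indexGb) + " GB");
        System.out.println("reader opened mid-optimize: " + peakWithReaderOpenedMidOptimizeGb(indexGb) + " GB");
    }
}
```

In practice the real numbers should be lower, since an optimized index is usually smaller than the original.
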
>> Different subject: I saw in 3.0.0RC1 that interrupting a merging thread was
>> being discussed. Couldn't you do something similar for searches? I let my
>> users do full-text searches on documents with over 50 fields. If they use too
>> many wildcards, the search could take a long time, and rather than
>> restricting what they can do, I would rather let them cancel the search
>> gracefully. Would that be feasible?
> 
> IndexWriter is interruptible via Thread.interrupt(), but searching
> currently is not.  However, TimeLimitingCollector can be used to set a
> timeout for searches.
> 
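
To illustrate the general idea behind a time-limited collector, here is a rough sketch in plain Java. It is not Lucene's TimeLimitingCollector and uses no Lucene types: HitSink, TimeLimitedSink, TimeBudgetExceededException, and TimeLimitedSearchDemo are all hypothetical names. The wrapper checks the elapsed time on every hit and aborts by throwing once the budget is exceeded, so the caller keeps whatever was collected up to that point. The clock is injected so the demo can use a fake one that advances 10 ms per reading.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

/** Hypothetical stand-in for a per-hit callback; not a Lucene type. */
interface HitSink {
    void collect(int docId);
}

/** Thrown when the time budget is used up; hits collected so far remain usable. */
class TimeBudgetExceededException extends RuntimeException {
    TimeBudgetExceededException(long allowedMs, long elapsedMs) {
        super("allowed " + allowedMs + " ms, elapsed " + elapsedMs + " ms");
    }
}

/** Wraps another sink and aborts collection once the time budget is exceeded. */
class TimeLimitedSink implements HitSink {
    private final HitSink delegate;
    private final LongSupplier clockMs;
    private final long startMs;
    private final long allowedMs;

    TimeLimitedSink(HitSink delegate, long allowedMs, LongSupplier clockMs) {
        this.delegate = delegate;
        this.allowedMs = allowedMs;
        this.clockMs = clockMs;
        this.startMs = clockMs.getAsLong(); // the budget starts at construction
    }

    @Override
    public void collect(int docId) {
        long elapsedMs = clockMs.getAsLong() - startMs;
        if (elapsedMs > allowedMs) {
            throw new TimeBudgetExceededException(allowedMs, elapsedMs);
        }
        delegate.collect(docId);
    }
}

class TimeLimitedSearchDemo {
    /** Feeds doc ids 0..n-1 through a time-limited sink and returns the hits kept. */
    static List<Integer> collectWithBudget(int n, long allowedMs) {
        List<Integer> hits = new ArrayList<>();
        long[] now = {0};
        // Fake clock for the demo: each reading is 10 ms later than the previous one.
        LongSupplier fakeClock = () -> {
            long t = now[0];
            now[0] += 10;
            return t;
        };
        HitSink limited = new TimeLimitedSink(doc -> hits.add(doc), allowedMs, fakeClock);
        try {
            for (int doc = 0; doc < n; doc++) {
                limited.collect(doc);
            }
        } catch (TimeBudgetExceededException e) {
            // Budget exceeded: fall through and return the partial results.
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(collectWithBudget(100, 35));
    }
}
```

The same wrapper shape could check a cancel flag set by the user instead of a clock, but that is only an assumption about how one might approach it, not something Lucene provides.
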
> Mike
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/Searching-while-optimizing-tp26485138p26502131.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
