lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From aurora <auror...@gmail.com>
Subject Re: index size doubled?
Date Tue, 21 Dec 2004 18:05:03 GMT
Thanks for the heads up. I'm using Lucene 1.4.2.

I tried to do optimize() again but it has no effect. Adding a just tiny  
dummy document would get rid of it.

I'm doing optimize every few hundred documents because I tried to simulate  
incremental update. This lead to another question I would post separately.

Thanks.


> Another possibility is that you are using an older version of Lucene,
> which was known to have a bug with similar symptoms.  Get the latest
> version of Lucene.
>
> You shouldn't really have multiple .cfs files after optimizing your
> index.  Also, optimize only at the end, if you care about indexing
> speed.
>
> Otis
>
> --- Paul Elschot <paul.elschot@xs4all.nl> wrote:
>
>> On Tuesday 21 December 2004 05:49, aurora wrote:
>> > I'm testing the rebuilding of the index. I add several hundred
>> documents,
>> > optimize and add another few hundred and so on. Right now I have
>> around
>> > 7000 files. I observed after the index gets to certain size.
>> Everytime
>> > after optimize, the are two files roughly the same size like below:
>> >
>> > 12/20/2004  01:57p                  13 deletable
>> > 12/20/2004  01:57p                  29 segments
>> > 12/20/2004  01:53p          14,460,367 _5qf.cfs
>> > 12/20/2004  01:57p          15,069,013 _5zr.cfs
>> >
>> > The index total index is double of what I expect. This is not
>> always
>> > reproducible. (I'm constantly tuning my program and the set of
>> document).
>> > Sometime I get a decent single document after optimize. What was
>> happening?
>>
>> Lucene tried to delete the older version (_5cf.cfs above), but got an
>> error
>> back from the file system. After that it has put the name of that
>> segment in
>> the deletable file, so it can try later to delete that segment.
>>
>> This is known behaviour on FAT file systems. These randomly take some
>> time
>> for themselves to finish closing a file after it has been correctly
>> closed by
>> a program.
>>
>> Regards,
>> Paul Elschot
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>>
>>



-- 
Using Opera's revolutionary e-mail client: http://www.opera.com/m2/


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message