lucene-dev mailing list archives

From lowfreq <>
Subject Re: Optimization and Corruption Issues
Date Thu, 01 Oct 2009 20:26:11 GMT

Thank you very much for the detailed information everyone!
I will try to use the information to make my code better.

I have split the optimization code out into a command-line app that runs the
optimize on another box. It's messy, but effective in keeping downtime to a
minimum. This will get the large number of segment files under control for
now. Too bad it takes a week or more. Hopefully I will not have to reindex
it anytime soon.

For the future, I think the best way around this is a transaction/agent-based
design. That way, I can keep a read-only copy for searching.
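
One way to picture the read-only copy idea is a small publish step that copies the index files into a fresh snapshot directory for the search side to open. This is only a sketch under assumptions that are not from this thread: the index is a flat directory of files, the writer has been closed (or is otherwise idle) while the copy runs, and the class name `IndexSnapshot`, the method `publish`, and the directory names are hypothetical. It uses only the Java standard library and does not call Lucene itself.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

public class IndexSnapshot {

    /**
     * Copies every regular file in liveIndex into a staging directory under
     * snapshotsRoot, then renames the staging directory to its final name.
     * Readers only ever see a fully copied snapshot.
     *
     * Assumption: nothing is writing to liveIndex while this runs.
     */
    public static Path publish(Path liveIndex, Path snapshotsRoot, String name)
            throws IOException {
        Files.createDirectories(snapshotsRoot);

        // Stage inside snapshotsRoot so the final rename stays on one filesystem.
        Path staging = Files.createTempDirectory(snapshotsRoot, "staging-");

        try (Stream<Path> files = Files.list(liveIndex)) {
            for (Path src : (Iterable<Path>) files::iterator) {
                if (Files.isRegularFile(src)) {
                    Files.copy(src, staging.resolve(src.getFileName().toString()));
                }
            }
        }

        // The rename is the publish step: the snapshot appears all at once.
        Path target = snapshotsRoot.resolve(name);
        return Files.move(staging, target, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

The search service would then open the newest snapshot directory and never touch the live index, which keeps reads and writes from sharing the same files. Each call should use a snapshot name that does not exist yet.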

My app currently uses two services, one for writes and one for reads.
I suspect that this may be what is causing the corruption.

Does anyone have experience with this type of setup, and has anyone seen, or
does anyone know, that it can cause a corrupted Lucene index?

I have heard that having more than one service attached to the index at a
time causes the problem I am seeing.

Thanks for the links to the old Luke distros, and thanks for all the quick


Andrzej Bialecki wrote:
> lowfreq wrote:
>> I have a Lucene index that is very large in size. 
>> It was created using a pre 2.1 version of 
>> The index is currently almost 20 GB, and has almost 7000 segment files. 
>> The problem I am having is that I need to optimize it, and can't do this
>> without the search functionality of my app being down for a week. 
>> I used the Luke tool from and it worked flawlessly, optimizing
>> the index in just over 2 hours. Problem is that my search cannot use it,
>> and I get Unknown Format Version errors, or just plain nothing found.
> You should be careful when using Lucene Java to modify Lucene.Net 
> indexes. I know for a fact that deflated data in Lucene Java is 
> incompatible with the deflater implementation in .Net, so it's easy to 
> create an incompatible index even when you use a supposedly compatible 
> version of Lucene Java. Perhaps versions around 2.0 still worked ok, but 
> no guarantees.
>> I understand that versions of Lucene that are newer than what the index
>> was built and is searched with can cause problems. 
>> What can I do to make this work? I have tried older versions of Luke, 0.7
>> was the oldest I could lay hands on, but even it uses a newer version of
>> Lucene. 
> Here are links to older versions of Luke:
>> My index version shows as 633103800023469045. The version the index is
>> written as after optimizing with Luke 0.7 is 633103800023469057. 
> This is just a timestamp, so it doesn't say what version of Lucene 
> created the index. If you open the index with Luke, in the Overview tab 
> there is a line that shows the index format version.
> -- 
> Best regards,
> Andrzej Bialecki     <><
>   ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
>  Contact: info at sigram dot com
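
To illustrate the point that the index version is a timestamp rather than a format number, the two values quoted above can be decoded. The sketch below assumes, and the thread does not confirm this, that the value was seeded from a .NET-style tick count (100-nanosecond intervals since 0001-01-01). The class name `IndexVersionTicks` is hypothetical.

```java
import java.time.Instant;

public class IndexVersionTicks {

    // 100 ns ticks between 0001-01-01T00:00:00 and 1970-01-01T00:00:00.
    private static final long TICKS_AT_UNIX_EPOCH = 621_355_968_000_000_000L;
    private static final long TICKS_PER_SECOND = 10_000_000L;

    /** Interprets a tick count as an Instant (assumption: .NET-style ticks). */
    public static Instant toInstant(long ticks) {
        long sinceEpoch = ticks - TICKS_AT_UNIX_EPOCH;
        long seconds = Math.floorDiv(sinceEpoch, TICKS_PER_SECOND);
        long nanos = Math.floorMod(sinceEpoch, TICKS_PER_SECOND) * 100L;
        return Instant.ofEpochSecond(seconds, nanos);
    }

    public static void main(String[] args) {
        long before = 633103800023469045L; // version reported before optimizing
        long after = 633103800023469057L;  // version reported after optimizing

        System.out.println(toInstant(before));
        System.out.println(after - before); // prints 12
    }
}
```

Under that assumption the first value decodes to a moment in late March 2007, which looks like a creation time (possibly local rather than UTC) and not a format identifier. The two values differ by only 12, which fits a counter that is bumped on each change to the index rather than a change of index format.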
