lucene-dev mailing list archives

From Andrzej Bialecki ...@getopt.org>
Subject Re: Optimization and Corruption Issues
Date Thu, 01 Oct 2009 15:56:41 GMT
lowfreq wrote:
> I have a Lucene index that is very large in size. 
> It was created using a pre 2.1 version of Lucene.net 2.0.0.4. 
> 
> The index is currently almost 20 GB, and has almost 7000 segment files. 
> The problem I am having is that I need to optimize it, and can't do this
> without the search functionality of my app being down for a week. 
> 
> I used the Luke tool from getopt.org and it worked flawlessly, optimizing
> the index in just over 2 hours. Problem is that my search cannot use it: I
> get Unknown Format Version errors, or searches simply find nothing. 

You should be careful when using Lucene Java to modify Lucene.Net 
indexes. I know for a fact that deflated data in Lucene Java is 
incompatible with the deflater implementation in .Net, so it's easy to 
create an incompatible index even when you use a supposedly compatible 
version of Lucene Java. Perhaps versions around 2.0 still worked OK, but 
there are no guarantees.


> 
> I understand that versions of Lucene that are newer than what the index was
> built and is searched with can cause problems. 
> 
> What can I do to make this work? I have tried older versions of Luke, 0.7
> was the oldest I could lay hands on, but even it uses a newer version of
> Lucene. 

Here are links to older versions of Luke:

	http://www.getopt.org/luke/luke-0.1.zip
	http://www.getopt.org/luke/luke-0.2.zip
	http://www.getopt.org/luke/luke-0.3.zip
	http://www.getopt.org/luke/luke-0.4.zip
	http://www.getopt.org/luke/luke-0.5/luke-0.5.jar
	http://www.getopt.org/luke/luke-0.5/luke-src-0.5.zip
	http://www.getopt.org/luke/luke-0.6/lukeall-0.6.jar
	http://www.getopt.org/luke/luke-0.6/luke-src-0.6.zip


> 
> My index version shows as 633103800023469045. The version the index is
> written as after optimizing with Luke 0.7 is 633103800023469057. 

That number is just a timestamp, so it doesn't tell you which version of 
Lucene created the index. If you open the index with Luke, the Overview 
tab has a line showing the index format version.
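
If you would rather check this outside Luke, here is a minimal Python sketch. 
It rests on two assumptions that are not confirmed in this thread: that the 
segments file starts with a big-endian Int32 format marker followed by an 
Int64 version (the layout described in the Lucene file format documentation 
for 1.4 and later), and that Lucene.Net seeds the version number from .NET 
ticks. The function names are only illustrative.

```python
import struct
from datetime import datetime, timedelta


def read_segments_header(path):
    """Return (format, version) read from the start of a segments file.

    Assumes the Lucene 1.4+ layout: Int32 format, then Int64 version,
    both big-endian. Older or different layouts will give nonsense.
    """
    with open(path, "rb") as f:
        header = f.read(12)
    if len(header) < 12:
        raise ValueError("file is too short to hold a segments header")
    fmt, version = struct.unpack(">iq", header)
    return fmt, version


def ticks_to_datetime(ticks):
    """Interpret a number as .NET ticks (100 ns units since 0001-01-01).

    Assumption: the index version was seeded from DateTime ticks.
    """
    return datetime(1, 1, 1) + timedelta(microseconds=ticks // 10)


if __name__ == "__main__":
    before = 633103800023469045  # version reported before optimizing
    after = 633103800023469057   # version reported after optimizing
    print(ticks_to_datetime(before))
    print(after - before)
```

Read as ticks, the first number from your message lands in March 2007, and 
the two numbers differ by only 12. That is consistent with the version being 
a timestamp-seeded counter rather than an indicator of the Lucene release. 
The format marker returned by read_segments_header is the value to compare 
against what your Lucene.Net version expects.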


-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



