lucene-java-user mailing list archives

From "Zhang, Lisheng" <Lisheng.Zh...@BroadVision.com>
Subject RE: Lucene indexed data corruption error
Date Sat, 30 Jun 2012 21:39:04 GMT
Hi Uwe,

I read your blog post. The only issue is that we are using Java 1.6, as indicated below.
Has a similar issue ever been reported for Java 1.6?

Two facts:
1) We are using Gluster to replicate the data into another folder (so that we have
a backup for fault tolerance); the replication runs continuously.

2) I just ran the indexing again (again starting from an empty folder), and
it worked this time. There were some minor data changes (we select data from an RDB
for indexing), but I think it is very unlikely that this error is caused by some
special kind of RDB data.

Thanks very much for the help, Lisheng


-----Original Message-----
From: Zhang, Lisheng [mailto:Lisheng.Zhang@broadvision.com]
Sent: Saturday, June 30, 2012 2:17 PM
To: java-user@lucene.apache.org
Subject: RE: Lucene indexed data corruption error


Thanks for such quick help!

The Java version we use is:
java -version
java version "1.6.0_20"
OpenJDK Runtime Environment (IcedTea6 1.9.13) (6b20-1.9.13-0ubuntu1~10.04.1)
OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)

Best regards, Lisheng

-----Original Message-----
From: Uwe Schindler [mailto:uwe@thetaphi.de]
Sent: Saturday, June 30, 2012 1:52 PM
To: java-user@lucene.apache.org
Subject: Re: Lucene indexed data corruption error


What JVM are you using? This looks like one of the VInt bugs we found in recent Oracle Java
versions, for which we have had workarounds since Lucene 3.1. See also my blog post about the
Java 7 bugs, which are closely related: blog.thetaphi.de
--
Uwe Schindler
H.-H.-Meier-Allee 63, 28213 Bremen
http://www.thetaphi.de



"Zhang, Lisheng" <Lisheng.Zhang@BroadVision.com> wrote:

Hi,

We have been using Lucene 2.3.2 successfully for years (yes, we should upgrade).

Recently we encountered a data corruption error when optimizing and committing with IndexWriter:

///
background merge hit exception: _14b:c61262 _1ag:c11225 _1gb:c9411 _1gv:c905 _1gw:c50 _1gx:c50 _1gy:c50 _1gz:c50 _1h0:c31 into _1h1 [optimize]
java.io.IOException: background merge hit exception: _14b:c61262 _1ag:c11225 _1gb:c9411 _1gv:c905 _1gw:c50 _1gx:c50 _1gy:c50 _1gz:c50 _1h0:c31 into _1h1 [optimize]
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1787)
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1727)
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1707)
///

We then ran the CheckIndex tool on the index and found that one segment (out of 13) has a problem:

///
test: stored fields.......ERROR [field data are in wrong format: java.util.zip.DataFormatException: unknown compression method]
org.apache.lucene.index.CorruptIndexException: field data are in wrong format: java.util.zip.DataFormatException: unknown compression method
at org.apache.lucene.index.FieldsReader.uncompress(FieldsReader.java:605)
at org.apache.lucene.index.FieldsReader.addField(FieldsReader.java:392)
at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:259)
at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:934)
at org.apache.lucene.index.IndexReader.document(IndexReader.java:844)
at org.apache.lucene.index.CheckIndex.testStoredFields(CheckIndex.java:702)
at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:517)
at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:898)
///
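
For illustration only: the exception type in the trace above comes from java.util.zip, so the same class of failure can be reproduced with the JDK alone, outside Lucene. The sketch below compresses a short string, overwrites the first two bytes of the compressed stream with zeros, and then tries to inflate both copies. The class and method names (InflateCorruptionDemo, compress, inflateError) are hypothetical and are not part of Lucene. It does not show what damaged the index here, only that an inflater rejects a stored value whose leading bytes have changed.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo class; not Lucene code.
class InflateCorruptionDemo {

    /** Compresses the input into a zlib stream, the default format for Deflater/Inflater. */
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Returns null if the data inflates cleanly, otherwise a description of the failure. */
    static String inflateError(byte[] data) {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        byte[] buf = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                // No progress and nothing left to read: the stream is incomplete.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    return "truncated or incomplete stream";
                }
            }
            return null;
        } catch (DataFormatException e) {
            return String.valueOf(e.getMessage());
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] good = compress("short title".getBytes(StandardCharsets.UTF_8));
        System.out.println("intact stream:  " + inflateError(good));

        // Simulate on-disk damage by zeroing the two-byte zlib header.
        byte[] bad = good.clone();
        bad[0] = 0;
        bad[1] = 0;
        System.out.println("damaged stream: " + inflateError(bad));
    }
}
```

If the first line prints null and the second prints an error message, the stored bytes no longer form a valid compressed stream. That points to the data having been damaged at or after write time rather than to the content of the field.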

Our stored fields are very simple (just an ID and a short title; the other fields are only for search).

Our data size is about 400 MB and 83K documents. We started indexing from an
empty folder. We have been using Lucene 2.3.2 for years, and this is the first
time we have encountered this issue.

The indexer runs on a Linux box; "uname -a" returns:
Linux <our box name> 2.6.32-342-ec2 #43-Ubuntu SMP Wed Jan 4 18:22:42 UTC 2012 x86_64 GNU/Linux

We really appreciate any guidance,

Lisheng


