lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Bryan Swift <bryan.j.sw...@gmail.com>
Subject Bytes seem the same but checksums differ
Date Sat, 15 Aug 2009 05:03:20 GMT
I'm attempting to create a Directory implementation (lucene-core 2.4.1) to
sit on top of Google's App Engine Datastore (written in Scala). In the
process of doing this I found something odd for which I'm hoping there is a
relatively simple solution.
When instantiating a new IndexWriter with my Directory implementation (which
uses Datastore based IndexInput and IndexOutput classes) the checksum of the
segments file (segments_1 because there is nothing in the index yet) varies
when calculated by ChecksumIndexOutput vs ChecksumIndexInput. Of course,
this causes a CorruptIndexException to be thrown at line 248. The
interesting thing is the array of bytes being written by my
DatastoreIndexOutput is the same array of bytes being ready by
DatastoreIndexInput. I've also noticed the difference between the checksums
is consistently that checksumThen (line 246 in SegmentInfos) is one less
than checksumNow (line 245 in SegmentInfos).

In an attempt to gain further information about this problem I added CRC32
objects to my IndexInput and IndexOutput definitions to in order to peak in
on their values while debugging and it seems the IndexInput and IndexOutput
classes I defined have the same checksum after reading and writing all the
bits.

The source for my implementation to this point can be found at
http://github.com/bryanjswift/quotidian/tree/search-checksums under
src/main/scala/quotidian/search

Any insight or assistance would be very much appreciated at this point.

Cheers.
Bryan

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message