lucene-java-user mailing list archives

From Bryan Swift <bryan.j.sw...@gmail.com>
Subject Re: Bytes seem the same but checksums differ
Date Sat, 15 Aug 2009 05:17:44 GMT
Right, so after looking at what was happening in SegmentInfos again I
noticed I was saving to the Datastore on IndexOutput.flush but not on close.
Persisting the file on close solved this particular problem.
Sorry about that.
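
A minimal sketch of the failure mode, for anyone who runs into the same symptom. It uses only the JDK and is not Lucene's API: `StoreOutput`, `STORE`, `storedTrailer`, and the payload are hypothetical names made up for illustration. It assumes the writer follows a two-phase pattern, in which a placeholder of checksum - 1 is written and flushed, then overwritten with the real checksum just before close. That pattern is an assumption here, chosen because it would explain the consistent off-by-one. If the bytes are persisted only on flush, the stored trailer keeps the placeholder value.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.CRC32;

public class PersistOnCloseSketch {
    // Hypothetical stand-in for the Datastore: file name -> persisted bytes.
    static final Map<String, byte[]> STORE = new HashMap<>();
    static final byte[] PAYLOAD = "segments payload".getBytes(StandardCharsets.UTF_8);

    // Hypothetical output class (not Lucene's IndexOutput).
    static class StoreOutput {
        private final String name;
        private final boolean persistOnClose;
        private byte[] data = new byte[0];
        private int pos = 0;

        StoreOutput(String name, boolean persistOnClose) {
            this.name = name;
            this.persistOnClose = persistOnClose;
        }

        // Writes at the current position, growing the buffer if needed.
        void write(byte[] b) {
            int end = pos + b.length;
            if (end > data.length) data = Arrays.copyOf(data, end);
            System.arraycopy(b, 0, data, pos, b.length);
            pos = end;
        }

        void writeLong(long v) { write(ByteBuffer.allocate(8).putLong(v).array()); }
        int position() { return pos; }
        void seek(int p) { pos = p; }

        // Persists a snapshot of everything written so far.
        void flush() { STORE.put(name, data.clone()); }

        // The fix: persist again on close so writes made after flush survive.
        void close() { if (persistOnClose) flush(); }
    }

    static long checksumOf(byte[] bytes) {
        CRC32 crc = new CRC32();
        crc.update(bytes, 0, bytes.length);
        return crc.getValue();
    }

    // Writes the payload plus a checksum trailer, then returns the trailer
    // as it was actually persisted.
    static long storedTrailer(String name, boolean persistOnClose) {
        StoreOutput out = new StoreOutput(name, persistOnClose);
        long checksum = checksumOf(PAYLOAD);
        out.write(PAYLOAD);
        int mark = out.position();
        out.writeLong(checksum - 1); // assumed placeholder, phase one
        out.flush();
        out.seek(mark);
        out.writeLong(checksum);     // real checksum, phase two
        out.close();
        byte[] stored = STORE.get(name);
        return ByteBuffer.wrap(stored, stored.length - 8, 8).getLong();
    }

    public static void main(String[] args) {
        long expected = checksumOf(PAYLOAD);
        System.out.println("flush only: off by " + (expected - storedTrailer("flushOnly", false)));
        System.out.println("with close: off by " + (expected - storedTrailer("withClose", true)));
    }
}
```

Running `main` prints an offset of 1 when the file is persisted only on flush and 0 when it is also persisted on close, which is the same shape as checksumThen being one less than checksumNow.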

On Sat, Aug 15, 2009 at 1:03 AM, Bryan Swift <bryan.j.swift@gmail.com> wrote:

> I'm attempting to create a Directory implementation (lucene-core 2.4.1) to
> sit on top of Google's App Engine Datastore (written in Scala). In the
> process of doing this I found something odd for which I'm hoping there is a
> relatively simple solution.
> When instantiating a new IndexWriter with my Directory implementation
> (which uses Datastore based IndexInput and IndexOutput classes) the checksum
> of the segments file (segments_1 because there is nothing in the index yet)
> varies when calculated by ChecksumIndexOutput vs ChecksumIndexInput. Of
> course, this causes a CorruptIndexException to be thrown at line 248. The
> interesting thing is the array of bytes being written by my
> DatastoreIndexOutput is the same array of bytes being read by
> DatastoreIndexInput. I've also noticed the difference between the checksums
> is consistently that checksumThen (line 246 in SegmentInfos) is one less
> than checksumNow (line 245 in SegmentInfos).
>
> In an attempt to gain further information about this problem I added CRC32
> objects to my IndexInput and IndexOutput definitions in order to peek in
> on their values while debugging and it seems the IndexInput and IndexOutput
> classes I defined have the same checksum after reading and writing all the
> bits.
>
> The source for my implementation to this point can be found at
> http://github.com/bryanjswift/quotidian/tree/search-checksums under
> src/main/scala/quotidian/search
>
> Any insight or assistance would be very much appreciated at this point.
>
> Cheers.
> Bryan
>
