lucene-java-user mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: How to configure lucene 4.x to read 3.x index files
Date Wed, 24 Sep 2014 01:18:56 GMT
As reported in the issue, since 4.8 we do stricter checks when reading
index files in.

Unfortunately, 3.0-3.3 indexes had bugs in the way they encoded the
deleted documents.

So for those indexes, we have to ignore the trailing garbage at the
end of the file.
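
To make that concrete, here is a rough sketch of the idea in plain Java.
It is only an illustration, not the actual Lucene code: the class name,
the method names, and the way the writer version is passed in as a
"major.minor" string are all made up for this example. The strict check
requires that every byte was consumed; the lenient one skips that
requirement when the segment was written by 3.0-3.3.

```java
import java.io.IOException;

// Hypothetical illustration only; none of these names exist in Lucene.
final class LenientEofCheck {

    // Strict check: every byte of the file must have been consumed.
    static void checkEOF(long position, long length, String resource) throws IOException {
        if (position != length) {
            throw new IOException("did not read all bytes from file: read " + position
                    + " vs size " + length + " (resource: " + resource + ")");
        }
    }

    // Assumption: the writer version is available as a "major.minor[.patch]" string.
    static boolean writtenBy30To33(String version) {
        String[] parts = version.split("\\.");
        if (parts.length < 2) {
            return false;
        }
        try {
            int major = Integer.parseInt(parts[0]);
            int minor = Integer.parseInt(parts[1]);
            return major == 3 && minor >= 0 && minor <= 3;
        } catch (NumberFormatException e) {
            return false; // unknown version format: fall back to the strict check
        }
    }

    // Lenient check: tolerate trailing bytes only for files written by 3.0-3.3.
    static void checkEOFLenient(long position, long length, String resource,
                                String writerVersion) throws IOException {
        if (writtenBy30To33(writerVersion) && position <= length) {
            return; // ignore trailing garbage at the end of the file
        }
        checkEOF(position, length, resource);
    }

    public static void main(String[] args) throws IOException {
        // Numbers taken from the exception in this thread: read 65 vs size 66.
        checkEOFLenient(65, 66, "_1os1_5.del", "3.1");
        System.out.println("accepted trailing byte for a 3.1 segment");
    }
}
```

With the numbers from the exception in this thread (read 65 vs size 66),
the lenient check accepts the file when the writer version is in the
3.0-3.3 range and still rejects it for any other version.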

On Tue, Sep 23, 2014 at 9:15 PM, Patrick Mi <patrick.mi@touchpoint.co.nz> wrote:
> Hi Robert/Uwe,
>
> I have tried v4.8 and v4.9 - not working either.
>
> V4.7.0, V4.7.1, v4.7.2 are good.
>
> Regards,
> Patrick
>
> -----Original Message-----
> From: Patrick Mi [mailto:patrick.mi@touchpoint.co.nz]
> Sent: Wednesday, 24 September 2014 12:24 p.m.
> To: 'java-user@lucene.apache.org'
> Subject: RE: How to configure lucene 4.x to read 3.x index files
>
> Hi Robert/Uwe,
>
> Thanks very much for the quick response.
>
> I tried again with a different index (28k documents), also generated
> with V3, and that worked.
>
> But the one I tried first (30k documents) does open in V3 but not in V4.10.
> Maybe something in that index causes a problem in V4 but not in V3.
>
> I have also tried an earlier version, V4.7, as Uwe suggested, and V4.7
> can open the V3 index that V4.10 failed to open.
>
> Regards,
>
> Patrick
>
>
>
> -----Original Message-----
> From: Robert Muir [mailto:rcmuir@gmail.com]
> Sent: Tuesday, 23 September 2014 11:52 p.m.
> To: java-user
> Subject: Re: How to configure lucene 4.x to read 3.x index files
>
> You should not have to configure anything.
>
> The exception should not happen: can I have this index to debug the issue?
>
> On Mon, Sep 22, 2014 at 11:07 PM, Patrick Mi
> <patrick.mi@touchpoint.co.nz> wrote:
>> Hi there,
>>
>> I understood that Lucene V4 could read 3.x index files by configuring
>> Lucene3xCodec but what exactly needs to be done here?
>>
>> I used DEMO code from V4.10.0 to generate v4 index files and could read
>> them without problem. When I tried to read index files generated from V3
>> I got the following errors:
>>
>> Exception in thread "main" org.apache.lucene.index.CorruptIndexException: did not read all bytes from file: read 65 vs size 66 (resource: BufferedChecksumIndexInput(MMapIndexInput(path="C:\indexes\v3\_1os1_5.del")))
>>         at org.apache.lucene.codecs.CodecUtil.checkEOF(CodecUtil.java:252)
>>         at org.apache.lucene.codecs.lucene40.BitVector.<init>(BitVector.java:363)
>>         at org.apache.lucene.codecs.lucene40.Lucene40LiveDocsFormat.readLiveDocs(Lucene40LiveDocsFormat.java:91)
>>         at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:116)
>>         at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:62)
>>         at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:913)
>>         at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:53)
>>         at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:67)
>>         at org.apache.lucene.demo.SearchFiles.main(SearchFiles.java:95)
>>
>> My classpath includes the following jars from V4:
>> lucene-core-4.10.0.jar
>> lucene-analyzers-common-4.10.0.jar
>> lucene-queries-4.10.0.jar
>> lucene-queryparser-4.10.0.jar
>> lucene-facet-4.10.0.jar
>> lucene-expressions-4.10.0.jar
>>
>> I noticed that META-INF/services/org.apache.lucene.codecs.Codec (part of
>> lucene-core-4.10.0.jar) contains the following lines:
>> org.apache.lucene.codecs.lucene40.Lucene40Codec
>> org.apache.lucene.codecs.lucene3x.Lucene3xCodec
>> org.apache.lucene.codecs.lucene41.Lucene41Codec
>> org.apache.lucene.codecs.lucene42.Lucene42Codec
>> org.apache.lucene.codecs.lucene45.Lucene45Codec
>> org.apache.lucene.codecs.lucene46.Lucene46Codec
>> org.apache.lucene.codecs.lucene49.Lucene49Codec
>> org.apache.lucene.codecs.lucene410.Lucene410Codec
>>
>> Does that mean Lucene3xCodec will be picked up automatically based on the
>> index files themselves?
>>
>> Where is the API that would let me force the code to use the V3 setting?
>> IndexReader and IndexSearcher don’t seem to have anywhere I can pass that in.
>>
>> I did some searching but couldn't find any useful resources covering this.
>> I would much appreciate it if someone could point me in the right direction.
>>
>> Regards,
>> Patrick
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>
