lucene-java-user mailing list archives

From Bernd Fehling <bernd.fehl...@uni-bielefeld.de>
Subject Re: index files naming
Date Mon, 03 Jan 2011 13:22:06 GMT
Hi Simon,

thanks a lot for your good explanation.

Best wishes,
Bernd


On 03.01.2011 13:51, Simon Willnauer wrote:
> Hey Bernd,
> 
> On Mon, Jan 3, 2011 at 1:35 PM, Bernd Fehling
> <bernd.fehling@uni-bielefeld.de> wrote:
>> Dear list,
>>
>> some questions about the names of the index files.
>> With an older Lucene/Solr 4.x version from trunk my index looks like:
>> _2t1.fdt
>> _2t1.fdx
>> _2t1.fnm
>> _2t1.frq
>> _2t1.nrm
>> _2t1.prx
>> _2t1.tii
>> _2t1.tis
>> segments_2
>> segments.gen
>>
>> With the most recent version from trunk it looks like:
>> _3a9.fdt
>> _3a9.fdx
>> _3a9.fnm
>> _3a9_0.frq
>> _3a9.nrm
>> _3a9_0.prx
>> _3a9_0.tii
>> _3a9_0.tis
>> segments_4
>> segments.gen
>>
>> Why is there an "_0" in some of the file names?
>> Does it come from Lucene or from Solr, or is it a fault in my system?
> 
> Lucene 4.0, as you might know, has the ability to plug in a Codec, which
> has full control over how postings are stored, which format is used,
> and what files are written. Each field within a segment can have its
> own codec, i.e. field "foo" can use "Standard" while field "bar" uses
> "Pulsing", for instance. In such a case, since Pulsing is just a
> wrapper around the Standard codec, both codecs try to write the same
> set of files per segment. For that reason we introduced a codec ID that
> is valid per segment. It is really an ordinal built from the set of
> codecs used per segment. This ordinal is used to build the filenames; in
> your case you only have one codec (I suppose it's the Standard codec)
> with the ord "0". This ordinal is used for all files that codec writes.
> The files without that ordinal (nrm, fdt, fdx and fnm) are written by
> the IndexWriter directly, but the functionality behind them might be
> exposed via codecs sooner or later.
> 
> So, after all, this is Lucene functionality and your system is just
> fine and doing the right thing!
>>
>> I also didn't find any information at
>> http://lucene.apache.org/java/3_0_3/fileformats.html
> 
> It's not a 3.x feature - codecs are introduced in 4.0, aka trunk.
> 
> simon
>>
>> Both indexes are optimized, any idea?
>>
>> Regards, Bernd
>>
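
The naming scheme Simon describes can be illustrated with a small sketch. This is not Lucene code: the function names `segment_file_name` and `assign_codec_ids` are hypothetical, and the rule that ordinals follow the order in which codecs first appear is an assumption made only for illustration. It shows how a per-segment codec ordinal keeps two codecs from colliding on the same file name, while files written directly by the IndexWriter carry no ordinal.

```python
from typing import Dict, Optional


def segment_file_name(segment: str, extension: str,
                      codec_id: Optional[int] = None) -> str:
    """Build '<segment>[_<codec_id>].<extension>'.

    codec_id is None for files that are not written by a codec
    (in the thread: nrm, fdt, fdx and fnm).
    """
    if codec_id is None:
        return f"{segment}.{extension}"
    return f"{segment}_{codec_id}.{extension}"


def assign_codec_ids(field_codecs: Dict[str, str]) -> Dict[str, int]:
    """Map each distinct codec name to an ordinal for one segment.

    Assumption: ordinals are handed out in order of first appearance.
    The real assignment order in Lucene may differ.
    """
    ids: Dict[str, int] = {}
    for codec in field_codecs.values():
        ids.setdefault(codec, len(ids))
    return ids


# Two fields with different codecs, as in the "foo"/"bar" example.
ids = assign_codec_ids({"foo": "Standard", "bar": "Pulsing"})
for codec, codec_id in ids.items():
    # Both codecs write a .frq file, but the ordinal keeps the names apart.
    print(codec, segment_file_name("_3a9", "frq", codec_id))

# A file written by the IndexWriter directly has no ordinal.
print(segment_file_name("_3a9", "fdt"))
```

With a single codec, as in Bernd's index, the only ordinal is 0, which gives names such as _3a9_0.frq next to plain names such as _3a9.fdt.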
