lucene-java-user mailing list archives

From Ian Soboroff <ian.sobor...@nist.gov>
Subject Re: Reconstruct segments file?
Date Mon, 07 Feb 2005 18:13:55 GMT
Doug Cutting <cutting@apache.org> writes:

> Ian Soboroff wrote:
>> I've looked over the file formats web page, and poked at a known-good
>> segments file from a separate, similar index using od(1) and such.  I
>> guess what I'm not sure how to do is to recover the SegSize from the
>> segment I have.
>
> The SegSize should be the same as the length in bytes of any of the 
> .f[0-9]+ files in the segment.  If your segment is in compound format 
> then you can use IndexReader.main() in the current SVN version to list 
> the files and sizes in the .cfs file, including its contained .f[0-9]+ 
> files.

Thanks, Doug, that is a huge help.
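
For anyone reading this in the archive, Doug's check for a
non-compound segment can be sketched roughly as below. This is only
a sketch: the class name NormFileSizes and the method normSizes are
made up for illustration and are not part of Lucene, and it assumes
the segment's files sit loose in the index directory rather than in
a .cfs file. It lists every .f[0-9]+ file of one segment with its
length in bytes, which by Doug's description should all equal
SegSize.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene.
class NormFileSizes {
    // Matches names ending in ".f" plus digits, e.g. "_a.f0" or "_a.f12",
    // but not "_a.frq", "_a.fnm", "_a.fdt" or "_a.fdx".
    private static final Pattern NORM_FILE = Pattern.compile(".+\\.f[0-9]+");

    /** Maps each .f[0-9]+ file of the given segment to its length in bytes. */
    static Map<String, Long> normSizes(Path indexDir, String segName) throws IOException {
        Map<String, Long> sizes = new TreeMap<>();
        // The glob narrows the listing to "<segName>.f*"; the regex does the real filtering.
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir, segName + ".f*")) {
            for (Path f : files) {
                String name = f.getFileName().toString();
                if (NORM_FILE.matcher(name).matches()) {
                    sizes.put(name, Files.size(f));
                }
            }
        }
        return sizes;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java NormFileSizes <indexDir> <segName>
        normSizes(Path.of(args[0]), args[1])
            .forEach((name, size) -> System.out.println(name + "\t" + size));
    }
}
```

If the sizes disagree with each other, the segment is probably
damaged and no single value can be trusted as SegSize.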

BTW, the fileformats.html page on the Lucene web site is incorrect
with regard to the segments file.  The description should read:

Segments --> Format, Version, Counter, SegCount, 
             <SegName, SegSize>^SegCount

That is, the page omits the Counter field.  Counter is a UInt32 and
is used to generate the next segment name (see
IndexWriter.newSegmentName()).
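
A minimal sketch of reading a segments file laid out as above, for
checking a hand-rebuilt file. Several things here are assumptions
rather than facts from this thread: the class SegmentsDump and its
methods are invented for illustration; Format, SegCount and SegSize
are taken to be 32-bit values and Version a 64-bit value, all
written high-order byte first; and SegName is taken to be a VInt
character count followed by the characters, read here as plain
ASCII, which is enough for generated names like "_a". The
nextSegmentName() helper mirrors what I believe
IndexWriter.newSegmentName() does (an underscore plus the counter in
radix 36), so check it against the source before relying on it.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader, not part of Lucene.
class SegmentsDump {
    static final class Segments {
        int format;
        long version;
        int counter;
        final List<String> names = new ArrayList<>();
        final List<Integer> sizes = new ArrayList<>();
    }

    // VInt: 7 data bits per byte, low-order group first; a set high bit means more bytes follow.
    static int readVInt(DataInputStream in) throws IOException {
        int b = in.readUnsignedByte();
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = in.readUnsignedByte();
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    // Assumption: names are ASCII, so each character occupies exactly one byte.
    static String readString(DataInputStream in) throws IOException {
        int length = readVInt(in);
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) in.readUnsignedByte());
        }
        return sb.toString();
    }

    // Reads Format, Version, Counter, SegCount, then SegCount pairs of <SegName, SegSize>.
    static Segments read(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw); // DataInputStream is big-endian
        Segments s = new Segments();
        s.format = in.readInt();
        s.version = in.readLong();
        s.counter = in.readInt();
        int segCount = in.readInt();
        for (int i = 0; i < segCount; i++) {
            s.names.add(readString(in));
            s.sizes.add(in.readInt());
        }
        return s;
    }

    // Assumed naming rule: "_" followed by the counter in base 36.
    static String nextSegmentName(int counter) {
        return "_" + Integer.toString(counter, Character.MAX_RADIX);
    }
}
```

Calling SegmentsDump.read(new FileInputStream("segments")) on a
known-good index and comparing the result with the rebuilt file is
one way to sanity-check the reconstruction. The sketch handles only
the layout given above and makes no attempt to detect other
versions of the file.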

Speaking of Counter, I have a dumb question.  If the segments are
named using an integer counter which is incremented, what is the point
in converting that counter into a string for the segment filename?
Why not just name the segments e.g. "1.frq", etc.?

Ian



