lucene-dev mailing list archives

From Marvin Humphrey <mar...@rectangular.com>
Subject File format Qs
Date Tue, 16 Aug 2005 12:10:03 GMT
Greets,

First clarification: in a Lucene string, it appears that the VInt at
the head counts bytes, not UTF-8 characters... correct?
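
For reference, here is a minimal sketch of how the VInt prefix itself can be decoded, assuming the layout given in the file formats document (seven payload bits per byte, low-order group first, high bit set when another byte follows). The class and method names are hypothetical and are not taken from the Lucene source. The sketch only recovers the number; whether that number counts bytes or characters is the open question above.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Lucene: decodes one VInt from a buffer.
class VIntSketch {

    // Reads a variable-length integer: 7 data bits per byte, low-order
    // group first; a set high bit means "more bytes follow".
    static int readVInt(ByteBuffer in) {
        byte b = in.get();
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = in.get();
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    public static void main(String[] args) {
        // 0x80 0x01 encodes 128; the trailing byte is left unread.
        ByteBuffer in = ByteBuffer.wrap(new byte[] {(byte) 0x80, 0x01, 0x61});
        System.out.println(readVInt(in));    // prints 128
        System.out.println(in.remaining());  // prints 1
    }
}
```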

Next, this document...

http://lucene.apache.org/java/docs/fileformats.html

... seems to indicate that Format (the first number written
to the 'segments' file) is a UInt32:

"Format, SegCount, SegSize --> UInt32"

However, it's -1, so it can't be an unsigned 32-bit integer.

Spelunking through SegmentInfos.java in 1.4.3, it looks like Format
is a big-endian two's-complement 32-bit integer.
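
To illustrate the difference, here is a small sketch, assuming the first four bytes of the file are FF FF FF FF (which is what a big-endian two's-complement -1 looks like on disk). The class and method names are hypothetical. The same four bytes give -1 when read as signed and 4294967295 when read as unsigned, so the label in the documentation decides how a reader should interpret them.

```java
// Hypothetical helper, not part of Lucene: interprets four big-endian bytes.
class FormatSketch {

    // Assembles a signed 32-bit value, most significant byte first.
    static int readInt32(byte[] b) {
        return ((b[0] & 0xFF) << 24)
             | ((b[1] & 0xFF) << 16)
             | ((b[2] & 0xFF) << 8)
             |  (b[3] & 0xFF);
    }

    // Reinterprets the same 32 bits as an unsigned value held in a long.
    static long asUnsigned(int v) {
        return v & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        byte[] head = {(byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF};
        int format = readInt32(head);
        System.out.println(format);              // prints -1
        System.out.println(asUnsigned(format));  // prints 4294967295
    }
}
```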

I think I see some other documentation glitches in the 1.4.3 source,  
but since I'm looking at an old release, I probably ought to hold off  
on those.

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/

