lucene-java-user mailing list archives

From wj <ppm10...@gmail.com>
Subject A doubt of the description about the tii file format in document
Date Fri, 06 Jul 2012 02:16:58 GMT
In http://lucene.apache.org/core/3_6_0/fileformats.html#tii the two files are described as follows:

1 tii structure
The structure of this file is very similar to the .tis file, with the
addition of one item per record, the IndexDelta.

TermInfoIndex (.tii)--> TIVersion, IndexTermCount, IndexInterval,
SkipInterval, MaxSkipLevels, TermIndices

TIVersion --> UInt32

IndexTermCount --> UInt64

IndexInterval --> UInt32

SkipInterval --> UInt32

TermIndices --> <TermInfo, IndexDelta>^IndexTermCount

IndexDelta --> VLong


2 tis structure

TermInfoFile (.tis)--> TIVersion, TermCount, IndexInterval,
SkipInterval, MaxSkipLevels, TermInfos

TIVersion --> UInt32

TermCount --> UInt64

IndexInterval --> UInt32

SkipInterval --> UInt32

MaxSkipLevels --> UInt32

TermInfos --> <TermInfo>^TermCount

TermInfo --> <Term, DocFreq, FreqDelta, ProxDelta, SkipDelta>

Term --> <PrefixLength, Suffix, FieldNum>

Suffix --> String

PrefixLength, DocFreq, FreqDelta, ProxDelta, SkipDelta --> VInt


My question is: is the TermInfo structure in the .tii file the same as the TermInfo in the .tis file?

---------------------------------------------THE TIS  HEX
FF FF FF FC 00 00 00 00  00 00 00 12 00 00 00 80
00 00 00 10 00 00 00 0A
00 02 6D 79 00 01 00 00
00 08 73 74 6F 72 65 79  65 73 00 01 02 04 00 04
74 65 73 74 00 01 01 01  00 02 6D 79 01 01 01 01
00 07 73 74 6F 72 65 6E  6F 01 01 01 01 00 04 74
65 73 74 01 01 01 01 00  04 64 6F 63 31 02 01 01
01 00 02 6D 79 02 01 01  01 00 08 73 74 6F 72 65
79 65 73 02 01 01 01 00  04 74 65 73 74 02 01 01
01 00 04 64 6F 63 32 03  01 01 01 00 02 6D 79 03
01 01 01 00 08 73 74 6F  72 65 79 65 73 03 01 01
01 00 04 74 65 73 74 03  01 01 01 00 04 64 6F 63
32 04 01 01 01 00 02 6D  79 04 01 01 01 00 07 73
74 6F 72 65 6E 6F 04 01  01 01 00 04 74 65 73 74
04 01 01 01
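
The first 24 bytes of the dump above are the fixed-width header from the grammar. A minimal Python sketch of reading them, assuming big-endian byte order (the Lucene file format documentation describes UInt32 and UInt64 as written high-order byte first) and showing TIVersion as a signed value so that FF FF FF FC prints as -4. The name `parse_header` is illustrative, not part of Lucene.

```python
import struct


def parse_header(data):
    """Read the five fixed-width header fields shared by .tis and .tii."""
    # '>' = big-endian, no padding; i = signed 32-bit,
    # Q = unsigned 64-bit, I = unsigned 32-bit (4 + 8 + 4 + 4 + 4 = 24 bytes)
    version, count, index_interval, skip_interval, max_skip_levels = struct.unpack(
        ">iQIII", data[:24]
    )
    return {
        "TIVersion": version,
        "TermCount": count,
        "IndexInterval": index_interval,
        "SkipInterval": skip_interval,
        "MaxSkipLevels": max_skip_levels,
    }


# Header bytes copied from the first two lines of the .tis dump.
tis_header = bytes.fromhex(
    "FF FF FF FC 00 00 00 00 00 00 00 12 00 00 00 80 00 00 00 10 00 00 00 0A"
)
header = parse_header(tis_header)
print(header)
```

For this dump the sketch reports 18 terms (0x12), an IndexInterval of 128, a SkipInterval of 16 and MaxSkipLevels of 10.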

One of the TermInfo records in the .tis file:
00 :PrefixLength
02 :length of the Suffix string
6D 79 :Suffix, the character codes of the term "my"
00 :FieldNum
01 :DocFreq, the term occurs in only one document
00 :FreqDelta, determines the position of this term's TermFreqs within the .frq file
00 :ProxDelta, determines the position of this term's TermPositions within the .prx file
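
The annotated record above can be checked mechanically. A minimal Python sketch, assuming the VInt encoding from the Lucene file format documentation (seven data bits per byte, low-order group first, high bit set when more bytes follow), assuming a String is a VInt byte length followed by UTF-8 bytes, and assuming SkipDelta is stored only when DocFreq is at least SkipInterval. `read_vint` and `read_term_info` are illustrative helper names, not Lucene API.

```python
def read_vint(data, pos):
    """Decode one VInt starting at pos; return (value, next_pos)."""
    value, shift = 0, 0
    while True:
        b = data[pos]
        pos += 1
        value |= (b & 0x7F) << shift   # low 7 bits carry data
        if b & 0x80 == 0:              # high bit clear: last byte
            return value, pos
        shift += 7


def read_term_info(data, pos, skip_interval):
    """Decode one TermInfo record; return (fields, next_pos)."""
    info = {}
    info["PrefixLength"], pos = read_vint(data, pos)
    suffix_len, pos = read_vint(data, pos)
    # Assumption: the suffix is stored as UTF-8 bytes.
    info["Suffix"] = data[pos:pos + suffix_len].decode("utf-8")
    pos += suffix_len
    for name in ("FieldNum", "DocFreq", "FreqDelta", "ProxDelta"):
        info[name], pos = read_vint(data, pos)
    # Assumption: SkipDelta is omitted for terms with DocFreq < SkipInterval.
    if info["DocFreq"] >= skip_interval:
        info["SkipDelta"], pos = read_vint(data, pos)
    return info, pos


# The first record after the 24-byte header of the .tis dump.
record = bytes.fromhex("00 02 6D 79 00 01 00 00")
info, end = read_term_info(record, 0, skip_interval=16)
print(info, end)
```

Under these assumptions the eight bytes are consumed exactly and the fields agree with the hand annotation: term "my", FieldNum 0, DocFreq 1, FreqDelta 0, ProxDelta 0.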


--------------------------------------------------------THE TII HEX
FF FF FF FC 00 00 00 00  00 00 00 01 00 00 00 80
00 00 00 10 00 00 00 0A  00 00 FF FF FF FF 0F 00
00 00 18

FF FF FF FC :TIVersion
00 00 00 00  00 00 00 01 :IndexTermCount
00 00 00 80 :IndexInterval
00 00 00 10 :SkipInterval
00 00 00 0A  00 00 FF FF FF FF 0F 00 00 00 18

But what is the TermInfo in the .tii file? This confuses me; any help would be appreciated. Thanks.
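
One way to test this is to decode the .tii bytes that follow the four header fields annotated above, on the assumption that 00 00 00 0A is MaxSkipLevels and that the rest is a single TermInfo laid out as in .tis, followed by an IndexDelta VLong. The Python sketch below does that. The helper names are illustrative, and reading FieldNum as a signed 32-bit value is my interpretation rather than something the documentation states.

```python
def read_vint(data, pos):
    """Decode one VInt/VLong starting at pos; return (value, next_pos)."""
    value, shift = 0, 0
    while True:
        b = data[pos]
        pos += 1
        value |= (b & 0x7F) << shift
        if b & 0x80 == 0:
            return value, pos
        shift += 7


def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed one."""
    return value - (1 << 32) if value >= (1 << 31) else value


# Bytes of the .tii dump after TIVersion .. MaxSkipLevels (24 bytes).
tail = bytes.fromhex("00 00 FF FF FF FF 0F 00 00 00 18")

fields = {}
pos = 0
fields["PrefixLength"], pos = read_vint(tail, pos)
suffix_len, pos = read_vint(tail, pos)
fields["Suffix"] = tail[pos:pos + suffix_len]   # empty when suffix_len is 0
pos += suffix_len
field_num, pos = read_vint(tail, pos)           # FF FF FF FF 0F
fields["FieldNum"] = to_signed32(field_num)
for name in ("DocFreq", "FreqDelta", "ProxDelta", "IndexDelta"):
    fields[name], pos = read_vint(tail, pos)

print(fields, "bytes consumed:", pos, "of", len(tail))
```

Read this way, all 11 bytes are consumed: an empty term with FieldNum -1, DocFreq, FreqDelta and ProxDelta of 0, and an IndexDelta of 24. That 24 equals the size of the .tis header (4 + 8 + 4 + 4 + 4 bytes), which would be consistent with the index entry pointing at the first TermInfo in the .tis file, but that is an inference from this one dump.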
