lucene-java-user mailing list archives

From Olivier Binda <olivier.bi...@wanadoo.fr>
Subject How to get the Terms from the TermsEnum of an IntField ?
Date Wed, 30 Jul 2014 06:21:25 GMT
Hello.

How do you decode the terms from the TermsEnum of an IntField indexed with
precisionStep = Integer.MAX_VALUE, i.e. the enum you get with
MultiFields.getTerms(reader, intField).iterator(null)?

I had mixed success getting the values out of this iterator
with NumericUtils.prefixCodedToInt.

I tried

while (true) {
    BytesRef ref = termsEnum?.next()
    if (ref == null) break
    int value = NumericUtils.prefixCodedToInt(ref)
}

But it doesn't work reliably, I guess because of the trie structure.

In an IntField with the values 1, 2, 3, 4, 5 it worked.
But in an IntField holding every value from 1 to 2500, I got exceptions:
many terms have a shift outside the 0..31 range, and the terms seem to come in
"blocks":

first a term with shift 0 and value n,
followed by many terms whose shift is outside 0..31 but which share
the same prefix...
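
To make the symptom concrete, here is a minimal, self-contained sketch of a prefix-coded int term: one header byte carrying the shift, followed by the value in 7-bit groups. It is a toy model, not Lucene's source. The class and method names are hypothetical, and the header offset 0x60 and the 7-bit layout are assumptions that may not match the real on-disk bytes. It only illustrates why reading the shift before decoding, and skipping every term whose shift is not 0, avoids the exception.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of prefix-coded int terms. Hypothetical names; not Lucene code.
class PrefixCodedSketch {
    static final int SHIFT_START = 0x60; // assumed header offset for int terms

    // Encode value at the given shift (0 = full precision).
    static byte[] encode(int value, int shift) {
        if (shift < 0 || shift > 31) {
            throw new IllegalArgumentException("shift must be in 0..31");
        }
        int n = (31 - shift) / 7 + 1;              // number of 7-bit payload bytes
        byte[] out = new byte[n + 1];
        out[0] = (byte) (SHIFT_START + shift);     // header byte stores the shift
        int bits = (value ^ 0x80000000) >>> shift; // flip sign bit so bytes sort numerically
        for (int i = n; i >= 1; i--) {
            out[i] = (byte) (bits & 0x7f);
            bits >>>= 7;
        }
        return out;
    }

    // Read the shift from the header; throws if the term is not a valid int term.
    static int shiftOf(byte[] term) {
        int shift = term[0] - SHIFT_START;
        if (shift < 0 || shift > 31) {
            throw new NumberFormatException("invalid shift: " + shift);
        }
        return shift;
    }

    // Decode a term; lower-precision terms come back with their low bits zeroed.
    static int decode(byte[] term) {
        int shift = shiftOf(term);
        int bits = 0;
        for (int i = 1; i < term.length; i++) {
            bits = (bits << 7) | term[i];
        }
        return (bits << shift) ^ 0x80000000;
    }

    // Keep only full-precision terms (shift == 0); skip everything else,
    // including terms whose header is not in the valid range at all.
    static List<Integer> fullPrecisionValues(List<byte[]> terms) {
        List<Integer> values = new ArrayList<>();
        for (byte[] term : terms) {
            if (term[0] - SHIFT_START != 0) continue;
            values.add(decode(term));
        }
        return values;
    }

    public static void main(String[] args) {
        List<byte[]> terms = new ArrayList<>();
        for (int v = 1; v <= 2500; v++) {
            terms.add(encode(v, 0));
        }
        terms.add(encode(256, 8));          // a lower-precision term
        terms.add(new byte[] {0x10, 0x01}); // a term that is not a valid int term
        System.out.println(fullPrecisionValues(terms).size());
    }
}
```

In this model, calling decode on the last term in main throws a NumberFormatException, much like the exceptions described above, while fullPrecisionValues ignores it. Whether the unexpected terms in a real index are lower-precision terms or something else entirely would still need to be checked against the actual term bytes.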

Best regards,
Olivier


