lucene-java-user mailing list archives

From Karolina Bernat <karolina.ber...@googlemail.com>
Subject Token position vs. token offset - how to bring them together?
Date Fri, 28 Jan 2011 15:41:26 GMT
Hello,

Now that I have made progress with my offset-info problem in HTML files, I
have a new one: bringing token position information together with token/term
offset information. Can someone tell me how to get a token if I know its
position? It would be nice to get a token's position from the Token class,
but I could only find the positionIncrement, which by itself is not really
helpful.

What I'm actually trying to do is find the offset information of a
span/phrase query match. I know that the contrib highlighter can highlight
phrase queries, but I want/need to do it on my own (or rather pass the
information to another application that does the highlighting of my
documents). I also couldn't really understand how the highlighter recognizes
that the individual tokens/terms belong to the phrase (i.e. if I search for
"peter pan", at the moment I also get the tokens 'peter' and 'pan' as
weighted terms, even if they occur individually).
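
To make the position/offset relationship concrete: a token's absolute position is the running sum of the position increments seen so far, counting from -1, so a first token with increment 1 lands at position 0. The sketch below is only an illustration under stated assumptions. `Tok`, `PositionOffsets`, `absolutePositions` and `phraseOffsets` are hypothetical names, not Lucene classes or methods, and the matching assumes an exact phrase with no stacked tokens (no increment of 0, such as injected synonyms).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for one analyzed token; this is not a Lucene class. */
final class Tok {
    final String term;
    final int positionIncrement; // gap to the previous token, as the token stream reports it
    final int startOffset;       // character offset where the token begins
    final int endOffset;         // character offset just past the end of the token

    Tok(String term, int positionIncrement, int startOffset, int endOffset) {
        this.term = term;
        this.positionIncrement = positionIncrement;
        this.startOffset = startOffset;
        this.endOffset = endOffset;
    }
}

/** Hypothetical helper that ties positions and offsets together. */
final class PositionOffsets {

    /** Absolute position of each token: running sum of increments, starting from -1. */
    static int[] absolutePositions(List<Tok> tokens) {
        int[] positions = new int[tokens.size()];
        int pos = -1;
        for (int i = 0; i < tokens.size(); i++) {
            pos += tokens.get(i).positionIncrement;
            positions[i] = pos;
        }
        return positions;
    }

    /**
     * Returns {startOffset, endOffset} for each place where the phrase terms
     * occur at consecutive absolute positions. Assumes no stacked tokens.
     */
    static List<int[]> phraseOffsets(List<Tok> tokens, String... phrase) {
        List<int[]> hits = new ArrayList<>();
        if (phrase.length == 0) {
            return hits;
        }
        int[] positions = absolutePositions(tokens);
        for (int i = 0; i + phrase.length <= tokens.size(); i++) {
            boolean match = true;
            for (int j = 0; j < phrase.length && match; j++) {
                // The term must match and sit exactly j positions after the first term.
                match = tokens.get(i + j).term.equals(phrase[j])
                        && positions[i + j] == positions[i] + j;
            }
            if (match) {
                Tok first = tokens.get(i);
                Tok last = tokens.get(i + phrase.length - 1);
                hits.add(new int[] {first.startOffset, last.endOffset});
            }
        }
        return hits;
    }
}
```

For the text "peter pan met peter and pan" with "and" dropped as a stop word (so the final "pan" carries an increment of 2), the phrase "peter pan" would match only the first two tokens, because the later "peter" and "pan" are not at adjacent positions.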

Thanks so much in advance!
Karolina
