lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Payloads and PhraseQuery
Date Thu, 12 Jul 2007 22:12:00 GMT

: That is off of the TermSpans class.  BTQ (BoostingTermQuery) is
	...
: I am not completely sure here, but it seems like we may need an
: efficient way to access the TermPositions for each document.  That
: is, the Spans class doesn't provide this and maybe it should
	...
: > I'm looking for Spans.getPositions(), as shown in
	...
: >> : I'm now looking at using payloads with SpanNearQuery but I don't
: >> see any
: >> : clear way of getting the payload(s) from the matching span
: >> terms. The

Hmm... okay so the issue is that in order to get the payload data, you
have to have a TermPositions instance.

instead of adding getPayload methods to the Spans class (which as Paul
points out, can have nesting issues) perhaps more general solutions would
be:

a) a more high level getPayload API that let's you get a payload
arbitrarily for a toc/position (perhaps as part of the TernDocs API?) ...
then for Spans you could use this new API with Spans.start() and
Spans.end(). (and all the positions in between)

b) add a variation of the TermPositions class to allow people to iterate
through the terms of a TermDoc in position order (TermPosition first
iterates over the Terms and then over the positions) ... then you could
seek(span.start()) to get the Payload data

c) add methods to the Spans API to get the subspans (if any) ... this
would be the Spans corrilary to getTerms() and would always return
TermSpans which would have TermPositions for getting payload data.



-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message