lucene-java-user mailing list archives

From Grant Ingersoll <>
Subject Re: Payloads and PhraseQuery
Date Fri, 13 Jul 2007 01:13:06 GMT

On Jul 12, 2007, at 6:12 PM, Chris Hostetter wrote:

> Hmm... okay, so the issue is that in order to get the payload data, you
> have to have a TermPositions instance.
> Instead of adding getPayload methods to the Spans class (which, as Paul
> points out, can have nesting issues), perhaps more general solutions
> would be:
> a) a more high-level getPayload API that lets you get a payload
> arbitrarily for a doc/position (perhaps as part of the TermDocs
> API?) ... then for Spans you could use this new API with Spans.start()
> and Spans.end() (and all the positions in between)

Not sure I follow this.  I don't see the fit w/ TermDocs.

> b) add a variation of the TermPositions class to allow people to
> iterate through the terms of a TermDoc in position order (TermPositions
> first iterates over the Terms and then over the positions) ... then you
> could seek(span.start()) to get the Payload data
> c) add methods to the Spans API to get the subspans (if any) ... this
> would be the Spans corollary to getTerms() and would always return
> TermSpans which would have TermPositions for getting payload data.

This could be a good alternative.

When we first talked about payloads, we wondered if we could just make
all Queries into SpanQueries by passing TermPositions instead of
TermDocs, but in the end decided not to do it because of performance
issues (some of which are lessened by lazy loading of TermPositions).

The thing is, I think, that Spans is already moving you along in
the term positions, so it just seems like a natural fit to have it
there, even if there is nesting.  It doesn't seem like it would be
that hard to then return the nested results, b/c you are just
collating the results from the underlying SpanTermQuery.  Having said
that, I haven't looked into the actual code, so take that w/ a grain
of salt.
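
A rough sketch of what "collating the results" from nested spans might look like. None of this is the Lucene API: PayloadSpan, TermPayloadSpan, NestedPayloadSpan and payloadStrings are hypothetical stand-ins made up for illustration. It assumes each leaf span covers one term position and that end() is one past the last position.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in, NOT a Lucene type: a span over positions
// [start, end) that can report the payloads of the positions it covers.
interface PayloadSpan {
    int start();
    int end();
    List<byte[]> payloads();
}

// Hypothetical leaf span for a single term position, roughly what a
// SpanTermQuery match would correspond to.
final class TermPayloadSpan implements PayloadSpan {
    private final int position;
    private final byte[] payload; // null when the position has no payload

    TermPayloadSpan(int position, byte[] payload) {
        this.position = position;
        this.payload = payload;
    }

    public int start() { return position; }
    public int end() { return position + 1; }

    public List<byte[]> payloads() {
        List<byte[]> out = new ArrayList<>();
        if (payload != null) {
            out.add(payload);
        }
        return out;
    }
}

// Hypothetical composite span: it holds no payloads of its own and only
// collates whatever its children report, however deeply they are nested.
final class NestedPayloadSpan implements PayloadSpan {
    private final List<PayloadSpan> children;

    NestedPayloadSpan(PayloadSpan... children) {
        if (children.length == 0) {
            throw new IllegalArgumentException("need at least one child span");
        }
        this.children = Arrays.asList(children);
    }

    public int start() {
        int min = Integer.MAX_VALUE;
        for (PayloadSpan child : children) {
            min = Math.min(min, child.start());
        }
        return min;
    }

    public int end() {
        int max = Integer.MIN_VALUE;
        for (PayloadSpan child : children) {
            max = Math.max(max, child.end());
        }
        return max;
    }

    public List<byte[]> payloads() {
        List<byte[]> out = new ArrayList<>();
        for (PayloadSpan child : children) {
            out.addAll(child.payloads()); // recursion handles the nesting
        }
        return out;
    }
}

public class PayloadSpanSketch {
    // Helper for the demo: decode every collated payload as UTF-8 text.
    static List<String> payloadStrings(PayloadSpan span) {
        List<String> out = new ArrayList<>();
        for (byte[] payload : span.payloads()) {
            out.add(new String(payload, StandardCharsets.UTF_8));
        }
        return out;
    }

    public static void main(String[] args) {
        PayloadSpan inner = new NestedPayloadSpan(
            new TermPayloadSpan(3, "noun".getBytes(StandardCharsets.UTF_8)),
            new TermPayloadSpan(4, null));
        PayloadSpan outer = new NestedPayloadSpan(
            inner,
            new TermPayloadSpan(6, "verb".getBytes(StandardCharsets.UTF_8)));
        System.out.println(outer.start() + ".." + outer.end()
            + " " + payloadStrings(outer));
    }
}
```

The point is only that the composite never needs direct access to term positions; whether the real Spans implementations can be arranged this way is exactly what still needs investigating.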

I will try to do some more investigation, as others are welcome to  
do.  Perhaps we should move this to dev?

