lucene-java-user mailing list archives

From Grant Ingersoll
Subject Re: Retrieving payloads for terms matched by a query
Date Fri, 22 May 2009 10:29:47 GMT

On May 22, 2009, at 12:28 AM, Dmitri Bichko wrote:

> Hi,
> I may be missing something obvious, but how do I get the payloads for
> the specific token positions that were matched by a query?

See SpanQuery.getPayloadSpans() and its SpanQuery derivatives.
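
To illustrate what a span with payloads gives you, here is a minimal plain-Java sketch. It does not call the Lucene API: the class MatchedPayloads, its Token and Span types, and phraseSpans() are hypothetical names made up for this example, and the payload strings are invented entity ids. It only models the idea that each match carries its own start and end positions, so the payloads you read back belong to the positions that actually matched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model, NOT Lucene code: a field is a list of tokens,
// and each token position carries one payload.
class MatchedPayloads {

    /** One indexed token: its term text and the payload stored at that position. */
    static final class Token {
        final String term;
        final String payload;

        Token(String term, String payload) {
            this.term = term;
            this.payload = payload;
        }
    }

    /** One match: positions [start, end) and the payloads found at those positions. */
    static final class Span {
        final int start;
        final int end;
        final List<String> payloads;

        Span(int start, int end, List<String> payloads) {
            this.start = start;
            this.end = end;
            this.payloads = payloads;
        }
    }

    /** Returns every exact, in-order occurrence of the phrase in the field. */
    static List<Span> phraseSpans(List<Token> field, String... phrase) {
        List<Span> spans = new ArrayList<Span>();
        if (phrase.length == 0) {
            return spans; // an empty phrase matches nothing
        }
        for (int start = 0; start + phrase.length <= field.size(); start++) {
            List<String> payloads = new ArrayList<String>();
            boolean matched = true;
            for (int i = 0; i < phrase.length; i++) {
                Token token = field.get(start + i);
                if (!token.term.equals(phrase[i])) {
                    matched = false;
                    break;
                }
                // Collect the payload of the position that took part in the match.
                payloads.add(token.payload);
            }
            if (matched) {
                spans.add(new Span(start, start + phrase.length, payloads));
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        // The field "A keyword B A" with made-up entity ids as payloads.
        List<Token> field = Arrays.asList(
                new Token("A", "a-1"),
                new Token("keyword", "kw"),
                new Token("B", "b-7"),
                new Token("A", "a-2"));

        for (Span span : phraseSpans(field, "A", "keyword", "B")) {
            System.out.println("span [" + span.start + "," + span.end + ") payloads=" + span.payloads);
        }
    }
}
```

For the field "A keyword B A", the phrase "A keyword B" matches once, at positions 0 to 3 (end exclusive), so the payload of the trailing A is not reported. The real spans work along the same lines: you walk the matches and ask each one for its payloads.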

> For example, if I have a phrase query like "A keyword B" that matches
> the field "A keyword B A", I can get the payloads for A and B with
> IndexReader.termPositions(), but that doesn't tell me which of the two
> positions of A matched the query.  I can kind of work back to it, but
> it quickly becomes difficult for more complex queries.
> Here's what I'm trying to accomplish: I have several entity classes,
> I'd want to create an index using the class names as tokens, and
> storing the specific entity ids in the payloads.  That way, I should
> be able to run queries in terms of the classes (i.e. '"CLASS_A
> CLASS_B"~10 NOT CLASS_C', etc) and then report the actual entities for
> the matching documents. (I realize I'll need custom tokenizers/filters
> to identify and tag the entities and handle class references in
> queries, but that part seems pretty straightforward).
> Does this sound workable?
> Thanks,
> Dmitri

Grant Ingersoll
