lucene-java-user mailing list archives

From Trejkaz <trej...@trypticon.org>
Subject Re: Opposite of SpanFirstQuery - Searching for documents by last term in a field
Date Wed, 14 Dec 2016 05:02:22 GMT
On Wed, Dec 12, 2012 at 3:04 AM, Ian Lea <ian.lea@gmail.com> wrote:
> The javadoc for SpanFirstQuery says it is a special case of
> SpanPositionRangeQuery so maybe you can use the latter directly,
> although you might need to know the position of the last term which
> might be a problem.
>
> Alternatives might include reversing the terms and using SpanFirst or
> adding a special "thisistheend" token to each field and using
> SpanNearQuery for dog and thisistheend with suitable value for slop
> and inOrder = true.
>
> Or take the last term and index it in a separate field so you can just
> search for lastterm: dog.

Idly wondering whether anyone has figured out a good way to do this in
the time since it was last asked.

Here are my problems with the existing ideas:

1. (Using SpanPositionRangeQuery) I am not really sure how to get the
position of the last term.
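
To make the obstacle in (1) concrete, here is a rough sketch of the
arithmetic, assuming the token count of the field were somehow known for
each document. The class and method names are hypothetical and nothing
here calls Lucene. It only shows that the start/end pair a
SpanPositionRangeQuery would need depends on the document's length, so a
single fixed range cannot cover documents of different lengths.

```java
/** Toy arithmetic only; hypothetical names, no Lucene dependency. */
final class EndRangeSketch {
    /**
     * Returns {start, end} with end exclusive, covering the last k token
     * positions of a field that holds `length` tokens.
     */
    static int[] lastK(int length, int k) {
        // Clamp at 0 so short fields are covered in full.
        return new int[] { Math.max(0, length - k), length };
    }
}
```

A 5-token field and a 100-token field give different ranges for the same
k, which is why the position of the last term has to come from somewhere.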

2. (Using a special token) Adding a token to every document skews term
statistics and requires manually filtering it out of term listings.
Additionally it ruins certain wildcard queries like field:* since now
every field will match.
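
For reference, the special-token idea in (2) can be modelled in a few
lines of plain Java. This is only a toy model with hypothetical names
and no Lucene calls. The slop rule is my assumption about how an
in-order SpanNearQuery over two single terms behaves: slop counts the
positions between the term and the marker, so 0 means the term is the
last real token.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the sentinel-token idea; hypothetical names, no Lucene dependency. */
final class SentinelSketch {
    // Stand-in for the special "thisistheend" token from the quoted suggestion.
    static final String END = "thisistheend";

    /** Returns the tokens as they would be indexed: the original tokens plus the marker. */
    static List<String> withSentinel(List<String> tokens) {
        List<String> out = new ArrayList<>(tokens);
        out.add(END);
        return out;
    }

    /** True if `term` occurs before the marker with at most `slop` positions in between. */
    static boolean nearEnd(List<String> indexed, String term, int slop) {
        int endPos = indexed.lastIndexOf(END);
        if (endPos < 0) {
            return false; // field was not indexed with the marker
        }
        for (int p = 0; p < endPos; p++) {
            if (indexed.get(p).equals(term) && endPos - p - 1 <= slop) {
                return true;
            }
        }
        return false;
    }
}
```

Note that withSentinel on an empty token list still yields one token,
which is exactly the field:* problem described above.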

3. (Indexing the last term(s) in a separate field) In our case we
don't know in advance what distance from the end of the content the
user will specify in the query. They might write:

  term w/10 end-of-content
  term w/1000 end-of-content
  ...

Other ideas:

4. Storing all the content twice initially seems to be a potential
solution, but starts looking very hard once you combine queries. For
instance, what about this:

  (term w/10 start-of-content) w/30 (another-term w/10 end-of-content)
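
A rough sketch of why the combined case in (4) gets hard, assuming the
second copy of the content is indexed with its tokens in reverse order
and that "w/k end-of-content" means "among the last k positions". Names
are hypothetical and nothing here calls Lucene. The point is that a
position in the reversed copy can only be compared with a position in
the forward copy once the field length is known.

```java
/** Toy arithmetic for a forward field plus a reversed copy; hypothetical names, no Lucene dependency. */
final class ReversedCopySketch {
    /** Maps a position in one copy to the same token's position in the other copy. */
    static int flip(int position, int length) {
        return length - 1 - position;
    }

    /** "term w/k end-of-content" becomes a from-the-start check on the reversed copy. */
    static boolean inLastK(int forwardPos, int length, int k) {
        return flip(forwardPos, length) < k;
    }

    /**
     * Tokens strictly between a hit in the forward copy and a later hit that
     * was found in the reversed copy. The result depends on `length`.
     */
    static int gapBetween(int forwardPosA, int reversedPosB, int length) {
        return flip(reversedPosB, length) - forwardPosA - 1;
    }
}
```

The same pair of positions (2 in the forward copy, 1 in the reversed
copy) is 5 tokens apart in a 10-token field and 15 apart in a 20-token
field, so the outer w/30 cannot be checked from the two copies alone.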

5. Put a payload on the last term and then _somehow_ (I have no idea
how payload queries work yet) use payload queries to do spans from that.


Is there any good solution to this that people have already figured
out? Is there another SpanPositionCheckQuery subclass that could be
written which somehow fetches the last position in the document from
within the acceptPosition method?

TX
