lucene-java-user mailing list archives

From Christopher Tignor <ctig...@thinkmap.com>
Subject Re: SpanQuery for Terms at same position
Date Mon, 23 Nov 2009 14:20:47 GMT
Tested it out.  It doesn't work.  A slop of zero means no words between
the provided terms, not that the terms share a position.  E.g. my query of
"plan" "_n" returns entries like "contingency plan".

My workaround for this problem is to use a PhraseQuery, where you can
explicitly set Terms to occur at the same location, to recover the desired
document ids. Then, because I need the payload data for each match, I create
a SpanTermQuery for each individual term and use a modified version of
PayloadSpanUtil to recover only the PayloadSpans for each query from the
document ids collected above. Finally, I find the intersection of all these
sets, factoring in where each span starts (the end will just be one ordinal
value after) within each document to ensure they're at the same position.
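
A minimal sketch of just the final intersection step, in plain Java with no
Lucene dependency. It assumes the spans for each term have already been
reduced to (document id, start position) pairs; the class and method names
(SamePositionIntersect, Hit, intersect) are hypothetical and are not part of
Lucene or of my actual code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration only; not a Lucene class. */
final class SamePositionIntersect {

    /** One span of a single term, reduced to (document id, start position). */
    static final class Hit {
        final int doc;
        final int start;

        Hit(int doc, int start) {
            this.doc = doc;
            this.start = start;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Hit)) {
                return false;
            }
            Hit h = (Hit) o;
            return doc == h.doc && start == h.start;
        }

        @Override
        public int hashCode() {
            return 31 * doc + start;
        }

        @Override
        public String toString() {
            return doc + ":" + start;
        }
    }

    /**
     * Returns the hits that appear in every per-term list, i.e. the places
     * where all terms occur in the same document at the same start position.
     * Results keep the order of the first list, without duplicates.
     */
    static List<Hit> intersect(List<List<Hit>> perTermHits) {
        List<Hit> result = new ArrayList<Hit>();
        if (perTermHits.isEmpty()) {
            return result;
        }
        // Start from the first term's hits and narrow with each other term.
        Set<Hit> common = new HashSet<Hit>(perTermHits.get(0));
        for (int i = 1; i < perTermHits.size(); i++) {
            common.retainAll(new HashSet<Hit>(perTermHits.get(i)));
        }
        Set<Hit> seen = new HashSet<Hit>();
        for (Hit h : perTermHits.get(0)) {
            // seen.add returns false for a hit that was already emitted.
            if (common.contains(h) && seen.add(h)) {
                result.add(h);
            }
        }
        return result;
    }
}
```

Since each single-term span ends one position after it starts, comparing the
start alone is enough here; spans of differing lengths would need the end in
the key as well.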

Definitely more work than it needs to be I think.  Still looking for another
way.

C>T>


On Sat, Nov 21, 2009 at 10:47 PM, Adriano Crestani <
adrianocrestani@gmail.com> wrote:

> Hi,
>
> I didn't test, but you might want to try SpanNearQuery and set slop to
> zero.
> Give it a try and let me know if it worked.
>
> Regards,
> Adriano Crestani
>
> On Thu, Nov 19, 2009 at 7:28 PM, Christopher Tignor <ctignor@thinkmap.com
> >wrote:
>
> > Hello,
> >
> > I would like to search for all documents that contain both "plan" and
> > "_v" (my part of speech token for verb) at the same position.
> > I have tokenized the documents accordingly so these tokens exist at the
> > same location.
> >
> > I can achieve this programmatically using PhraseQueries by adding the
> > Terms explicitly at the same position but I need to be able to recover
> > the Payload data for each term found within the matched instance of my
> > query.
> >
> > Unfortunately the PayloadSpanUtil doesn't seem to return the same results
> > as the PhraseQuery, possibly because it is converting it into Spans first
> > which do not support searching for Terms at the same document position?
> >
> > Any help appreciated.
> >
> > thanks,
> >
> > C>T>
> >
> > --
> > TH!NKMAP
> >
> > Christopher Tignor | Senior Software Architect
> > 155 Spring Street NY, NY 10012
> > p.212-285-8600 x385 f.212-285-8999
> >
>



-- 
TH!NKMAP

Christopher Tignor | Senior Software Architect
155 Spring Street NY, NY 10012
p.212-285-8600 x385 f.212-285-8999
