lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Paul Elschot <paul.elsc...@xs4all.nl>
Subject Re: SpanQuery for Terms at same position
Date Wed, 25 Nov 2009 21:25:04 GMT
Op woensdag 25 november 2009 21:20:33 schreef Christopher Tignor:
> It's worth noting however that this -1 slop doesn't seem to work for cases
> where oyu want to discover instances of more than two terms at the same
> position.  Would be nice to be able to explicitly set this in the query
> construction.

I think requiring n terms at the same position would need a slop of 1-n,
and I'd like to have some test cases added for that.
Now if I only had some time...

Regards,
Paul Elschot

> 
> thanks,
> 
> C>T>
> On Tue, Nov 24, 2009 at 9:17 AM, Christopher Tignor <ctignor@thinkmap.com>wrote:
> 
> > yes that indeed works for me.
> >
> > thanks,
> >
> > C>T>
> >
> >
> > On Mon, Nov 23, 2009 at 5:50 PM, Paul Elschot <paul.elschot@xs4all.nl>wrote:
> >
> >> Op maandag 23 november 2009 20:07:58 schreef Christopher Tignor:
> >> > Also, I noticed that with the above edit to NearSpansOrdered I am
> >> getting
> >> > erroneous results fo normal ordered searches using searches like:
> >> >
> >> > "_n" followed by "work"
> >> >
> >> > where because "_n" and "work" are at the same position the code changes
> >> > accept their pairing as a valid in-order result now that the eqaul to
> >> clause
> >> > has been added to the inequality.
> >>
> >> Thanks for trying this. Indeed the "followed by" semantics is broken for
> >> the ordered case when spans at the same positions are considered
> >> ordered.
> >>
> >> Did I understand correctly that the unordered case with a slop of -1
> >> and without the edit works to match terms at the same position?
> >> In that case it may be worthwhile to add that to the javadocs,
> >> and also add a few testcases.
> >>
> >> Regards,
> >> Paul Elschot
> >>
> >> >
> >> > C>T>
> >> >
> >> > On Mon, Nov 23, 2009 at 12:26 PM, Christopher Tignor
> >> > <ctignor@thinkmap.com>wrote:
> >> >
> >> > > Thanks so much for this.
> >> > >
> >> > > Using an un-ordered query, the -1 slop indeed returns the correct
> >> results,
> >> > > matching tokens at the same position.
> >> > >
> >> > > I tried the same query but ordered both after and before rebuilding
> >> the
> >> > > source with Paul's changes to NearSpansOrdered but the query was still
> >> > > failing, returning no results.
> >> > >
> >> > > C>T>
> >> > >
> >> > >
> >> > > On Mon, Nov 23, 2009 at 11:59 AM, Mark Miller <markrmiller@gmail.com
> >> >wrote:
> >> > >
> >> > >> Your trying -1 with ordered right? Try it with non ordered.
> >> > >>
> >> > >> Christopher Tignor wrote:
> >> > >> > A slop of -1 doesn't work either.  I get no results returned.
> >> > >> >
> >> > >> > this would be a *really* helpful feature for me if someone
might
> >> suggest
> >> > >> an
> >> > >> > implementation as I would really like to be able to do arbitrary
> >> span
> >> > >> > searches where tokens may be at the same position and also
in other
> >> > >> > positions where the ordering of subsequent terms may be restricted
> >> as
> >> > >> per
> >> > >> > the normal span API.
> >> > >> >
> >> > >> > thanks,
> >> > >> >
> >> > >> > C>T>
> >> > >> >
> >> > >> > On Sun, Nov 22, 2009 at 7:50 AM, Paul Elschot <
> >> paul.elschot@xs4all.nl
> >> > >> >wrote:
> >> > >> >
> >> > >> >
> >> > >> >> Op zondag 22 november 2009 04:47:50 schreef Adriano Crestani:
> >> > >> >>
> >> > >> >>> Hi,
> >> > >> >>>
> >> > >> >>> I didn't test, but you might want to try SpanNearQuery
and set
> >> slop to
> >> > >> >>>
> >> > >> >> zero.
> >> > >> >>
> >> > >> >>> Give it a try and let me know if it worked.
> >> > >> >>>
> >> > >> >> The slop is the number of positions "in between", so
zero would
> >> still
> >> > >> be
> >> > >> >> too
> >> > >> >> much to only match at the same position.
> >> > >> >>
> >> > >> >> SpanNearQuery may or may not work for a slop of -1, but
one could
> >> try
> >> > >> >> that for both the ordered and unordered cases.
> >> > >> >> One way to do that is to start from the existing test
cases.
> >> > >> >>
> >> > >> >> Regards,
> >> > >> >> Paul Elschot
> >> > >> >>
> >> > >> >>
> >> > >> >>> Regards,
> >> > >> >>> Adriano Crestani
> >> > >> >>>
> >> > >> >>> On Thu, Nov 19, 2009 at 7:28 PM, Christopher Tignor
<
> >> > >> >>>
> >> > >> >> ctignor@thinkmap.com>wrote:
> >> > >> >>
> >> > >> >>>> Hello,
> >> > >> >>>>
> >> > >> >>>> I would like to search for all documents that
contain both
> >> "plan" and
> >> > >> >>>>
> >> > >> >> "_v"
> >> > >> >>
> >> > >> >>>> (my part of speech token for verb) at the same
position.
> >> > >> >>>> I have tokenized the documents accordingly so
these tokens
> >> exists at
> >> > >> >>>>
> >> > >> >> the
> >> > >> >>
> >> > >> >>>> same location.
> >> > >> >>>>
> >> > >> >>>> I can achieve programaticaly using PhraseQueries
by adding the
> >> Terms
> >> > >> >>>> explicitly at the same position but I need to
be able to recover
> >> the
> >> > >> >>>> Payload
> >> > >> >>>> data for each
> >> > >> >>>> term found within the matched instance of my
query.
> >> > >> >>>>
> >> > >> >>>> Unfortunately the PayloadSpanUtil doesn't seem
to return the
> >> same
> >> > >> >>>>
> >> > >> >> results
> >> > >> >>
> >> > >> >>>> as
> >> > >> >>>> the PhraseQuery, possibly becuase it is converting
it inoto
> >> Spans
> >> > >> first
> >> > >> >>>> which do not support searching for Terms at the
same document
> >> > >> position?
> >> > >> >>>>
> >> > >> >>>> Any help appreciated.
> >> > >> >>>>
> >> > >> >>>> thanks,
> >> > >> >>>>
> >> > >> >>>> C>T>
> >> > >> >>>>
> >> > >> >>>> --
> >> > >> >>>> TH!NKMAP
> >> > >> >>>>
> >> > >> >>>> Christopher Tignor | Senior Software Architect
> >> > >> >>>> 155 Spring Street NY, NY 10012
> >> > >> >>>> p.212-285-8600 x385 f.212-285-8999
> >> > >> >>>>
> >> > >> >>>>
> >> > >> >>
> >> ---------------------------------------------------------------------
> >> > >> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> > >> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >> > >> >>
> >> > >> >>
> >> > >> >>
> >> > >> >
> >> > >> >
> >> > >> >
> >> > >>
> >> > >>
> >> > >> ---------------------------------------------------------------------
> >> > >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> > >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >> > >>
> >> > >>
> >> > >
> >> > >
> >> > > --
> >> > > TH!NKMAP
> >> > >
> >> > > Christopher Tignor | Senior Software Architect
> >> > > 155 Spring Street NY, NY 10012
> >> > > p.212-285-8600 x385 f.212-285-8999
> >> > >
> >> >
> >> >
> >> >
> >> > --
> >> > TH!NKMAP
> >> >
> >> > Christopher Tignor | Senior Software Architect
> >> > 155 Spring Street NY, NY 10012
> >> > p.212-285-8600 x385 f.212-285-8999
> >> >
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>
> >
> >
> > --
> > TH!NKMAP
> >
> > Christopher Tignor | Senior Software Architect
> > 155 Spring Street NY, NY 10012
> > p.212-285-8600 x385 f.212-285-8999
> >
> 
> 
> 
> -- 
> TH!NKMAP
> 
> Christopher Tignor | Senior Software Architect
> 155 Spring Street NY, NY 10012
> p.212-285-8600 x385 f.212-285-8999
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message