lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erick Erickson <erickerick...@gmail.com>
Subject Re: SpanQuery for Terms at same position
Date Thu, 26 Nov 2009 00:49:15 GMT
Hmmm, are they unit tests? Or would you be wiling to create stand-alone
unit tests demonstrating this and submit it as a patch?

Best
Erick@AlwaysTrollingForWorkFromOthers.Opportunistic.

On Wed, Nov 25, 2009 at 5:38 PM, Christopher Tignor <ctignor@thinkmap.com>wrote:

> my own tests with my own data show you are correct and the 1-n slop works
> for matching terms at the same ordinal position.
>
> thanks!
>
> C>T>
>
> On Wed, Nov 25, 2009 at 4:25 PM, Paul Elschot <paul.elschot@xs4all.nl
> >wrote:
>
> > Op woensdag 25 november 2009 21:20:33 schreef Christopher Tignor:
> > > It's worth noting however that this -1 slop doesn't seem to work for
> > cases
> > > where oyu want to discover instances of more than two terms at the same
> > > position.  Would be nice to be able to explicitly set this in the query
> > > construction.
> >
> > I think requiring n terms at the same position would need a slop of 1-n,
> > and I'd like to have some test cases added for that.
> > Now if I only had some time...
> >
> > Regards,
> > Paul Elschot
> >
> > >
> > > thanks,
> > >
> > > C>T>
> > > On Tue, Nov 24, 2009 at 9:17 AM, Christopher Tignor <
> > ctignor@thinkmap.com>wrote:
> > >
> > > > yes that indeed works for me.
> > > >
> > > > thanks,
> > > >
> > > > C>T>
> > > >
> > > >
> > > > On Mon, Nov 23, 2009 at 5:50 PM, Paul Elschot <
> paul.elschot@xs4all.nl
> > >wrote:
> > > >
> > > >> Op maandag 23 november 2009 20:07:58 schreef Christopher Tignor:
> > > >> > Also, I noticed that with the above edit to NearSpansOrdered
I am
> > > >> getting
> > > >> > erroneous results fo normal ordered searches using searches like:
> > > >> >
> > > >> > "_n" followed by "work"
> > > >> >
> > > >> > where because "_n" and "work" are at the same position the code
> > changes
> > > >> > accept their pairing as a valid in-order result now that the
eqaul
> > to
> > > >> clause
> > > >> > has been added to the inequality.
> > > >>
> > > >> Thanks for trying this. Indeed the "followed by" semantics is broken
> > for
> > > >> the ordered case when spans at the same positions are considered
> > > >> ordered.
> > > >>
> > > >> Did I understand correctly that the unordered case with a slop of
-1
> > > >> and without the edit works to match terms at the same position?
> > > >> In that case it may be worthwhile to add that to the javadocs,
> > > >> and also add a few testcases.
> > > >>
> > > >> Regards,
> > > >> Paul Elschot
> > > >>
> > > >> >
> > > >> > C>T>
> > > >> >
> > > >> > On Mon, Nov 23, 2009 at 12:26 PM, Christopher Tignor
> > > >> > <ctignor@thinkmap.com>wrote:
> > > >> >
> > > >> > > Thanks so much for this.
> > > >> > >
> > > >> > > Using an un-ordered query, the -1 slop indeed returns the
> correct
> > > >> results,
> > > >> > > matching tokens at the same position.
> > > >> > >
> > > >> > > I tried the same query but ordered both after and before
> > rebuilding
> > > >> the
> > > >> > > source with Paul's changes to NearSpansOrdered but the query
was
> > still
> > > >> > > failing, returning no results.
> > > >> > >
> > > >> > > C>T>
> > > >> > >
> > > >> > >
> > > >> > > On Mon, Nov 23, 2009 at 11:59 AM, Mark Miller <
> > markrmiller@gmail.com
> > > >> >wrote:
> > > >> > >
> > > >> > >> Your trying -1 with ordered right? Try it with non ordered.
> > > >> > >>
> > > >> > >> Christopher Tignor wrote:
> > > >> > >> > A slop of -1 doesn't work either.  I get no results
returned.
> > > >> > >> >
> > > >> > >> > this would be a *really* helpful feature for me
if someone
> > might
> > > >> suggest
> > > >> > >> an
> > > >> > >> > implementation as I would really like to be able
to do
> > arbitrary
> > > >> span
> > > >> > >> > searches where tokens may be at the same position
and also in
> > other
> > > >> > >> > positions where the ordering of subsequent terms
may be
> > restricted
> > > >> as
> > > >> > >> per
> > > >> > >> > the normal span API.
> > > >> > >> >
> > > >> > >> > thanks,
> > > >> > >> >
> > > >> > >> > C>T>
> > > >> > >> >
> > > >> > >> > On Sun, Nov 22, 2009 at 7:50 AM, Paul Elschot <
> > > >> paul.elschot@xs4all.nl
> > > >> > >> >wrote:
> > > >> > >> >
> > > >> > >> >
> > > >> > >> >> Op zondag 22 november 2009 04:47:50 schreef
Adriano
> Crestani:
> > > >> > >> >>
> > > >> > >> >>> Hi,
> > > >> > >> >>>
> > > >> > >> >>> I didn't test, but you might want to try
SpanNearQuery and
> > set
> > > >> slop to
> > > >> > >> >>>
> > > >> > >> >> zero.
> > > >> > >> >>
> > > >> > >> >>> Give it a try and let me know if it worked.
> > > >> > >> >>>
> > > >> > >> >> The slop is the number of positions "in between",
so zero
> > would
> > > >> still
> > > >> > >> be
> > > >> > >> >> too
> > > >> > >> >> much to only match at the same position.
> > > >> > >> >>
> > > >> > >> >> SpanNearQuery may or may not work for a slop
of -1, but one
> > could
> > > >> try
> > > >> > >> >> that for both the ordered and unordered cases.
> > > >> > >> >> One way to do that is to start from the existing
test cases.
> > > >> > >> >>
> > > >> > >> >> Regards,
> > > >> > >> >> Paul Elschot
> > > >> > >> >>
> > > >> > >> >>
> > > >> > >> >>> Regards,
> > > >> > >> >>> Adriano Crestani
> > > >> > >> >>>
> > > >> > >> >>> On Thu, Nov 19, 2009 at 7:28 PM, Christopher
Tignor <
> > > >> > >> >>>
> > > >> > >> >> ctignor@thinkmap.com>wrote:
> > > >> > >> >>
> > > >> > >> >>>> Hello,
> > > >> > >> >>>>
> > > >> > >> >>>> I would like to search for all documents
that contain both
> > > >> "plan" and
> > > >> > >> >>>>
> > > >> > >> >> "_v"
> > > >> > >> >>
> > > >> > >> >>>> (my part of speech token for verb)
at the same position.
> > > >> > >> >>>> I have tokenized the documents accordingly
so these tokens
> > > >> exists at
> > > >> > >> >>>>
> > > >> > >> >> the
> > > >> > >> >>
> > > >> > >> >>>> same location.
> > > >> > >> >>>>
> > > >> > >> >>>> I can achieve programaticaly using
PhraseQueries by adding
> > the
> > > >> Terms
> > > >> > >> >>>> explicitly at the same position but
I need to be able to
> > recover
> > > >> the
> > > >> > >> >>>> Payload
> > > >> > >> >>>> data for each
> > > >> > >> >>>> term found within the matched instance
of my query.
> > > >> > >> >>>>
> > > >> > >> >>>> Unfortunately the PayloadSpanUtil doesn't
seem to return
> the
> > > >> same
> > > >> > >> >>>>
> > > >> > >> >> results
> > > >> > >> >>
> > > >> > >> >>>> as
> > > >> > >> >>>> the PhraseQuery, possibly becuase it
is converting it
> inoto
> > > >> Spans
> > > >> > >> first
> > > >> > >> >>>> which do not support searching for
Terms at the same
> > document
> > > >> > >> position?
> > > >> > >> >>>>
> > > >> > >> >>>> Any help appreciated.
> > > >> > >> >>>>
> > > >> > >> >>>> thanks,
> > > >> > >> >>>>
> > > >> > >> >>>> C>T>
> > > >> > >> >>>>
> > > >> > >> >>>> --
> > > >> > >> >>>> TH!NKMAP
> > > >> > >> >>>>
> > > >> > >> >>>> Christopher Tignor | Senior Software
Architect
> > > >> > >> >>>> 155 Spring Street NY, NY 10012
> > > >> > >> >>>> p.212-285-8600 x385 f.212-285-8999
> > > >> > >> >>>>
> > > >> > >> >>>>
> > > >> > >> >>
> > > >>
> ---------------------------------------------------------------------
> > > >> > >> >> To unsubscribe, e-mail:
> > java-user-unsubscribe@lucene.apache.org
> > > >> > >> >> For additional commands, e-mail:
> > java-user-help@lucene.apache.org
> > > >> > >> >>
> > > >> > >> >>
> > > >> > >> >>
> > > >> > >> >
> > > >> > >> >
> > > >> > >> >
> > > >> > >>
> > > >> > >>
> > > >> > >>
> > ---------------------------------------------------------------------
> > > >> > >> To unsubscribe, e-mail:
> java-user-unsubscribe@lucene.apache.org
> > > >> > >> For additional commands, e-mail:
> > java-user-help@lucene.apache.org
> > > >> > >>
> > > >> > >>
> > > >> > >
> > > >> > >
> > > >> > > --
> > > >> > > TH!NKMAP
> > > >> > >
> > > >> > > Christopher Tignor | Senior Software Architect
> > > >> > > 155 Spring Street NY, NY 10012
> > > >> > > p.212-285-8600 x385 f.212-285-8999
> > > >> > >
> > > >> >
> > > >> >
> > > >> >
> > > >> > --
> > > >> > TH!NKMAP
> > > >> >
> > > >> > Christopher Tignor | Senior Software Architect
> > > >> > 155 Spring Street NY, NY 10012
> > > >> > p.212-285-8600 x385 f.212-285-8999
> > > >> >
> > > >>
> > > >>
> ---------------------------------------------------------------------
> > > >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > >> For additional commands, e-mail: java-user-help@lucene.apache.org
> > > >>
> > > >>
> > > >
> > > >
> > > > --
> > > > TH!NKMAP
> > > >
> > > > Christopher Tignor | Senior Software Architect
> > > > 155 Spring Street NY, NY 10012
> > > > p.212-285-8600 x385 f.212-285-8999
> > > >
> > >
> > >
> > >
> > > --
> > > TH!NKMAP
> > >
> > > Christopher Tignor | Senior Software Architect
> > > 155 Spring Street NY, NY 10012
> > > p.212-285-8600 x385 f.212-285-8999
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
>
>
> --
> TH!NKMAP
>
> Christopher Tignor | Senior Software Architect
> 155 Spring Street NY, NY 10012
> p.212-285-8600 x385 f.212-285-8999
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message