lucene-java-user mailing list archives

From Christopher Tignor <ctig...@thinkmap.com>
Subject Re: SpanQuery for Terms at same position
Date Wed, 25 Nov 2009 22:38:55 GMT
My own tests with my own data show you are correct: a slop of 1-n works
for matching n terms at the same ordinal position.
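
For anyone finding this thread later, here is a minimal sketch of the
arithmetic as I understand it. It does not call Lucene at all. It assumes
the unordered near-span check compares (largest end - smallest start -
total span length) against the slop, and that every term span has length 1.
The class and method names are made up for illustration.

```java
// Hypothetical model of the unordered slop check; not Lucene code.
class SamePositionSlop {

    // Slop that a group of single-term spans at these positions would use up,
    // ASSUMING matchSlop = (maxEnd - minStart) - totalLength.
    static int requiredSlop(int... positions) {
        int minStart = Integer.MAX_VALUE;
        int maxEnd = Integer.MIN_VALUE;
        for (int p : positions) {
            minStart = Math.min(minStart, p);
            maxEnd = Math.max(maxEnd, p + 1); // a term span ends one past its position
        }
        int totalLength = positions.length;   // each term span has length 1
        return (maxEnd - minStart) - totalLength;
    }

    // True when the group would be accepted for the given query slop.
    static boolean matches(int querySlop, int... positions) {
        return requiredSlop(positions) <= querySlop;
    }

    public static void main(String[] args) {
        System.out.println(requiredSlop(5, 6));     // adjacent terms
        System.out.println(requiredSlop(5, 5));     // two terms, same position
        System.out.println(requiredSlop(5, 5, 5));  // three terms, same position
        System.out.println(matches(-1, 5, 5, 6));   // only two of three coincide
        System.out.println(matches(-2, 5, 5, 6));   // same tokens, slop 1-n with n = 3
    }
}
```

Under that assumption, n terms at one position use up a slop of 1-n, and a
slop of -1 with three terms would also accept tokens where only two of them
share a position, which is why the tighter 1-n value is needed.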

thanks!

C>T>

On Wed, Nov 25, 2009 at 4:25 PM, Paul Elschot <paul.elschot@xs4all.nl> wrote:

> On Wednesday 25 November 2009 21:20:33, Christopher Tignor wrote:
> > It's worth noting, however, that this -1 slop doesn't seem to work for
> > cases where you want to discover instances of more than two terms at the
> > same position.  Would be nice to be able to explicitly set this in the
> > query construction.
>
> I think requiring n terms at the same position would need a slop of 1-n,
> and I'd like to have some test cases added for that.
> Now if I only had some time...
>
> Regards,
> Paul Elschot
>
> >
> > thanks,
> >
> > C>T>
> > On Tue, Nov 24, 2009 at 9:17 AM, Christopher Tignor <ctignor@thinkmap.com> wrote:
> >
> > > yes that indeed works for me.
> > >
> > > thanks,
> > >
> > > C>T>
> > >
> > >
> > > On Mon, Nov 23, 2009 at 5:50 PM, Paul Elschot <paul.elschot@xs4all.nl> wrote:
> > >
> > >> On Monday 23 November 2009 20:07:58, Christopher Tignor wrote:
> > >> > Also, I noticed that with the above edit to NearSpansOrdered I am
> > >> > getting erroneous results for normal ordered searches using
> > >> > searches like:
> > >> >
> > >> > "_n" followed by "work"
> > >> >
> > >> > where, because "_n" and "work" are at the same position, the code
> > >> > changes accept their pairing as a valid in-order result now that
> > >> > the equal-to clause has been added to the inequality.
> > >>
> > >> Thanks for trying this. Indeed the "followed by" semantics is broken
> > >> for the ordered case when spans at the same positions are considered
> > >> ordered.
> > >>
> > >> Did I understand correctly that the unordered case with a slop of -1
> > >> and without the edit works to match terms at the same position?
> > >> In that case it may be worthwhile to add that to the javadocs,
> > >> and also add a few testcases.
> > >>
> > >> Regards,
> > >> Paul Elschot
> > >>
> > >> >
> > >> > C>T>
> > >> >
> > >> > On Mon, Nov 23, 2009 at 12:26 PM, Christopher Tignor
> > >> > <ctignor@thinkmap.com> wrote:
> > >> >
> > >> > > Thanks so much for this.
> > >> > >
> > >> > > Using an un-ordered query, the -1 slop indeed returns the correct
> > >> > > results, matching tokens at the same position.
> > >> > >
> > >> > > I tried the same query but ordered, both before and after
> > >> > > rebuilding the source with Paul's changes to NearSpansOrdered,
> > >> > > but the query was still failing, returning no results.
> > >> > >
> > >> > > C>T>
> > >> > >
> > >> > >
> > >> > > On Mon, Nov 23, 2009 at 11:59 AM, Mark Miller <markrmiller@gmail.com> wrote:
> > >> > >
> > >> > >> You're trying -1 with ordered, right? Try it with non-ordered.
> > >> > >>
> > >> > >> Christopher Tignor wrote:
> > >> > >> > A slop of -1 doesn't work either.  I get no results returned.
> > >> > >> >
> > >> > >> > This would be a *really* helpful feature for me if someone might
> > >> > >> > suggest an implementation, as I would really like to be able to
> > >> > >> > do arbitrary span searches where tokens may be at the same
> > >> > >> > position and also in other positions where the ordering of
> > >> > >> > subsequent terms may be restricted as per the normal span API.
> > >> > >> >
> > >> > >> > thanks,
> > >> > >> >
> > >> > >> > C>T>
> > >> > >> >
> > >> > >> > On Sun, Nov 22, 2009 at 7:50 AM, Paul Elschot <paul.elschot@xs4all.nl> wrote:
> > >> > >> >
> > >> > >> >
> > >> > >> >> On Sunday 22 November 2009 04:47:50, Adriano Crestani wrote:
> > >> > >> >>
> > >> > >> >>> Hi,
> > >> > >> >>>
> > >> > >> >>> I didn't test, but you might want to try SpanNearQuery and set
> > >> > >> >>> slop to zero.
> > >> > >> >>>
> > >> > >> >>> Give it a try and let me know if it worked.
> > >> > >> >>>
> > >> > >> >> The slop is the number of positions "in between", so zero would
> > >> > >> >> still be too much to only match at the same position.
> > >> > >> >>
> > >> > >> >> SpanNearQuery may or may not work for a slop of -1, but one
> > >> > >> >> could try that for both the ordered and unordered cases.
> > >> > >> >> One way to do that is to start from the existing test cases.
> > >> > >> >>
> > >> > >> >> Regards,
> > >> > >> >> Paul Elschot
> > >> > >> >>
> > >> > >> >>
> > >> > >> >>> Regards,
> > >> > >> >>> Adriano Crestani
> > >> > >> >>>
> > >> > >> >>> On Thu, Nov 19, 2009 at 7:28 PM, Christopher Tignor <ctignor@thinkmap.com> wrote:
> > >> > >> >>>
> > >> > >> >>>> Hello,
> > >> > >> >>>>
> > >> > >> >>>> I would like to search for all documents that contain both
> > >> > >> >>>> "plan" and "_v" (my part of speech token for verb) at the
> > >> > >> >>>> same position. I have tokenized the documents accordingly so
> > >> > >> >>>> these tokens exist at the same location.
> > >> > >> >>>>
> > >> > >> >>>> I can achieve this programmatically using PhraseQueries by
> > >> > >> >>>> adding the Terms explicitly at the same position, but I need
> > >> > >> >>>> to be able to recover the Payload data for each term found
> > >> > >> >>>> within the matched instance of my query.
> > >> > >> >>>>
> > >> > >> >>>> Unfortunately the PayloadSpanUtil doesn't seem to return the
> > >> > >> >>>> same results as the PhraseQuery, possibly because it is
> > >> > >> >>>> converting it into Spans first, which do not support
> > >> > >> >>>> searching for Terms at the same document position?
> > >> > >> >>>>
> > >> > >> >>>> Any help appreciated.
> > >> > >> >>>>
> > >> > >> >>>> thanks,
> > >> > >> >>>>
> > >> > >> >>>> C>T>
> > >> > >> >>>>
> > >> > >> >>>> --
> > >> > >> >>>> TH!NKMAP
> > >> > >> >>>>
> > >> > >> >>>> Christopher Tignor | Senior Software Architect
> > >> > >> >>>> 155 Spring Street NY, NY 10012
> > >> > >> >>>> p.212-285-8600 x385 f.212-285-8999
> > >> > >> >>>>
> > >> > >> >>>>
> > >> > >> >>
> > >> > >> >> ---------------------------------------------------------------------
> > >> > >> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > >> > >> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> > >> > >> >>
> > >> > >> >>
> > >> > >> >>
> > >> > >> >
> > >> > >> >
> > >> > >> >
> > >> > >>
> > >> > >>
> > >> > >>
> > >> > >>
> > >> > >>
> > >> > >
> > >> > >
> > >> > >
> > >> >
> > >> >
> > >> >
> > >> >
> > >>
> > >>
> > >>
> > >
> > >
> > >
> >
> >
> >
> >
>
>
>


-- 
TH!NKMAP

Christopher Tignor | Senior Software Architect
155 Spring Street NY, NY 10012
p.212-285-8600 x385 f.212-285-8999
