lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Christopher Tignor <ctig...@thinkmap.com>
Subject Re: SpanQuery for Terms at same position
Date Mon, 30 Nov 2009 14:19:40 GMT
It would take a bit of work work / learning (haven't used a RAMDirectory
yet) to make them into test cases usable by others and am deep into this
project and under the gun right now.  But if some time surfaces I will for
sure...

thanks -

C>T>

On Wed, Nov 25, 2009 at 7:49 PM, Erick Erickson <erickerickson@gmail.com>wrote:

> Hmmm, are they unit tests? Or would you be wiling to create stand-alone
> unit tests demonstrating this and submit it as a patch?
>
> Best
> Erick@AlwaysTrollingForWorkFromOthers.Opportunistic.
>
> On Wed, Nov 25, 2009 at 5:38 PM, Christopher Tignor <ctignor@thinkmap.com
> >wrote:
>
> > my own tests with my own data show you are correct and the 1-n slop works
> > for matching terms at the same ordinal position.
> >
> > thanks!
> >
> > C>T>
> >
> > On Wed, Nov 25, 2009 at 4:25 PM, Paul Elschot <paul.elschot@xs4all.nl
> > >wrote:
> >
> > > Op woensdag 25 november 2009 21:20:33 schreef Christopher Tignor:
> > > > It's worth noting however that this -1 slop doesn't seem to work for
> > > cases
> > > > where oyu want to discover instances of more than two terms at the
> same
> > > > position.  Would be nice to be able to explicitly set this in the
> query
> > > > construction.
> > >
> > > I think requiring n terms at the same position would need a slop of
> 1-n,
> > > and I'd like to have some test cases added for that.
> > > Now if I only had some time...
> > >
> > > Regards,
> > > Paul Elschot
> > >
> > > >
> > > > thanks,
> > > >
> > > > C>T>
> > > > On Tue, Nov 24, 2009 at 9:17 AM, Christopher Tignor <
> > > ctignor@thinkmap.com>wrote:
> > > >
> > > > > yes that indeed works for me.
> > > > >
> > > > > thanks,
> > > > >
> > > > > C>T>
> > > > >
> > > > >
> > > > > On Mon, Nov 23, 2009 at 5:50 PM, Paul Elschot <
> > paul.elschot@xs4all.nl
> > > >wrote:
> > > > >
> > > > >> Op maandag 23 november 2009 20:07:58 schreef Christopher Tignor:
> > > > >> > Also, I noticed that with the above edit to NearSpansOrdered
I
> am
> > > > >> getting
> > > > >> > erroneous results fo normal ordered searches using searches
> like:
> > > > >> >
> > > > >> > "_n" followed by "work"
> > > > >> >
> > > > >> > where because "_n" and "work" are at the same position the
code
> > > changes
> > > > >> > accept their pairing as a valid in-order result now that
the
> eqaul
> > > to
> > > > >> clause
> > > > >> > has been added to the inequality.
> > > > >>
> > > > >> Thanks for trying this. Indeed the "followed by" semantics is
> broken
> > > for
> > > > >> the ordered case when spans at the same positions are considered
> > > > >> ordered.
> > > > >>
> > > > >> Did I understand correctly that the unordered case with a slop
of
> -1
> > > > >> and without the edit works to match terms at the same position?
> > > > >> In that case it may be worthwhile to add that to the javadocs,
> > > > >> and also add a few testcases.
> > > > >>
> > > > >> Regards,
> > > > >> Paul Elschot
> > > > >>
> > > > >> >
> > > > >> > C>T>
> > > > >> >
> > > > >> > On Mon, Nov 23, 2009 at 12:26 PM, Christopher Tignor
> > > > >> > <ctignor@thinkmap.com>wrote:
> > > > >> >
> > > > >> > > Thanks so much for this.
> > > > >> > >
> > > > >> > > Using an un-ordered query, the -1 slop indeed returns
the
> > correct
> > > > >> results,
> > > > >> > > matching tokens at the same position.
> > > > >> > >
> > > > >> > > I tried the same query but ordered both after and before
> > > rebuilding
> > > > >> the
> > > > >> > > source with Paul's changes to NearSpansOrdered but
the query
> was
> > > still
> > > > >> > > failing, returning no results.
> > > > >> > >
> > > > >> > > C>T>
> > > > >> > >
> > > > >> > >
> > > > >> > > On Mon, Nov 23, 2009 at 11:59 AM, Mark Miller <
> > > markrmiller@gmail.com
> > > > >> >wrote:
> > > > >> > >
> > > > >> > >> Your trying -1 with ordered right? Try it with
non ordered.
> > > > >> > >>
> > > > >> > >> Christopher Tignor wrote:
> > > > >> > >> > A slop of -1 doesn't work either.  I get no
results
> returned.
> > > > >> > >> >
> > > > >> > >> > this would be a *really* helpful feature for
me if someone
> > > might
> > > > >> suggest
> > > > >> > >> an
> > > > >> > >> > implementation as I would really like to be
able to do
> > > arbitrary
> > > > >> span
> > > > >> > >> > searches where tokens may be at the same position
and also
> in
> > > other
> > > > >> > >> > positions where the ordering of subsequent
terms may be
> > > restricted
> > > > >> as
> > > > >> > >> per
> > > > >> > >> > the normal span API.
> > > > >> > >> >
> > > > >> > >> > thanks,
> > > > >> > >> >
> > > > >> > >> > C>T>
> > > > >> > >> >
> > > > >> > >> > On Sun, Nov 22, 2009 at 7:50 AM, Paul Elschot
<
> > > > >> paul.elschot@xs4all.nl
> > > > >> > >> >wrote:
> > > > >> > >> >
> > > > >> > >> >
> > > > >> > >> >> Op zondag 22 november 2009 04:47:50 schreef
Adriano
> > Crestani:
> > > > >> > >> >>
> > > > >> > >> >>> Hi,
> > > > >> > >> >>>
> > > > >> > >> >>> I didn't test, but you might want
to try SpanNearQuery
> and
> > > set
> > > > >> slop to
> > > > >> > >> >>>
> > > > >> > >> >> zero.
> > > > >> > >> >>
> > > > >> > >> >>> Give it a try and let me know if it
worked.
> > > > >> > >> >>>
> > > > >> > >> >> The slop is the number of positions "in
between", so zero
> > > would
> > > > >> still
> > > > >> > >> be
> > > > >> > >> >> too
> > > > >> > >> >> much to only match at the same position.
> > > > >> > >> >>
> > > > >> > >> >> SpanNearQuery may or may not work for
a slop of -1, but
> one
> > > could
> > > > >> try
> > > > >> > >> >> that for both the ordered and unordered
cases.
> > > > >> > >> >> One way to do that is to start from the
existing test
> cases.
> > > > >> > >> >>
> > > > >> > >> >> Regards,
> > > > >> > >> >> Paul Elschot
> > > > >> > >> >>
> > > > >> > >> >>
> > > > >> > >> >>> Regards,
> > > > >> > >> >>> Adriano Crestani
> > > > >> > >> >>>
> > > > >> > >> >>> On Thu, Nov 19, 2009 at 7:28 PM, Christopher
Tignor <
> > > > >> > >> >>>
> > > > >> > >> >> ctignor@thinkmap.com>wrote:
> > > > >> > >> >>
> > > > >> > >> >>>> Hello,
> > > > >> > >> >>>>
> > > > >> > >> >>>> I would like to search for all
documents that contain
> both
> > > > >> "plan" and
> > > > >> > >> >>>>
> > > > >> > >> >> "_v"
> > > > >> > >> >>
> > > > >> > >> >>>> (my part of speech token for verb)
at the same position.
> > > > >> > >> >>>> I have tokenized the documents
accordingly so these
> tokens
> > > > >> exists at
> > > > >> > >> >>>>
> > > > >> > >> >> the
> > > > >> > >> >>
> > > > >> > >> >>>> same location.
> > > > >> > >> >>>>
> > > > >> > >> >>>> I can achieve programaticaly using
PhraseQueries by
> adding
> > > the
> > > > >> Terms
> > > > >> > >> >>>> explicitly at the same position
but I need to be able to
> > > recover
> > > > >> the
> > > > >> > >> >>>> Payload
> > > > >> > >> >>>> data for each
> > > > >> > >> >>>> term found within the matched
instance of my query.
> > > > >> > >> >>>>
> > > > >> > >> >>>> Unfortunately the PayloadSpanUtil
doesn't seem to return
> > the
> > > > >> same
> > > > >> > >> >>>>
> > > > >> > >> >> results
> > > > >> > >> >>
> > > > >> > >> >>>> as
> > > > >> > >> >>>> the PhraseQuery, possibly becuase
it is converting it
> > inoto
> > > > >> Spans
> > > > >> > >> first
> > > > >> > >> >>>> which do not support searching
for Terms at the same
> > > document
> > > > >> > >> position?
> > > > >> > >> >>>>
> > > > >> > >> >>>> Any help appreciated.
> > > > >> > >> >>>>
> > > > >> > >> >>>> thanks,
> > > > >> > >> >>>>
> > > > >> > >> >>>> C>T>
> > > > >> > >> >>>>
> > > > >> > >> >>>> --
> > > > >> > >> >>>> TH!NKMAP
> > > > >> > >> >>>>
> > > > >> > >> >>>> Christopher Tignor | Senior Software
Architect
> > > > >> > >> >>>> 155 Spring Street NY, NY 10012
> > > > >> > >> >>>> p.212-285-8600 x385 f.212-285-8999
> > > > >> > >> >>>>
> > > > >> > >> >>>>
> > > > >> > >> >>
> > > > >>
> > ---------------------------------------------------------------------
> > > > >> > >> >> To unsubscribe, e-mail:
> > > java-user-unsubscribe@lucene.apache.org
> > > > >> > >> >> For additional commands, e-mail:
> > > java-user-help@lucene.apache.org
> > > > >> > >> >>
> > > > >> > >> >>
> > > > >> > >> >>
> > > > >> > >> >
> > > > >> > >> >
> > > > >> > >> >
> > > > >> > >>
> > > > >> > >>
> > > > >> > >>
> > > ---------------------------------------------------------------------
> > > > >> > >> To unsubscribe, e-mail:
> > java-user-unsubscribe@lucene.apache.org
> > > > >> > >> For additional commands, e-mail:
> > > java-user-help@lucene.apache.org
> > > > >> > >>
> > > > >> > >>
> > > > >> > >
> > > > >> > >
> > > > >> > > --
> > > > >> > > TH!NKMAP
> > > > >> > >
> > > > >> > > Christopher Tignor | Senior Software Architect
> > > > >> > > 155 Spring Street NY, NY 10012
> > > > >> > > p.212-285-8600 x385 f.212-285-8999
> > > > >> > >
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> > --
> > > > >> > TH!NKMAP
> > > > >> >
> > > > >> > Christopher Tignor | Senior Software Architect
> > > > >> > 155 Spring Street NY, NY 10012
> > > > >> > p.212-285-8600 x385 f.212-285-8999
> > > > >> >
> > > > >>
> > > > >>
> > ---------------------------------------------------------------------
> > > > >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > > >> For additional commands, e-mail: java-user-help@lucene.apache.org
> > > > >>
> > > > >>
> > > > >
> > > > >
> > > > > --
> > > > > TH!NKMAP
> > > > >
> > > > > Christopher Tignor | Senior Software Architect
> > > > > 155 Spring Street NY, NY 10012
> > > > > p.212-285-8600 x385 f.212-285-8999
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > TH!NKMAP
> > > >
> > > > Christopher Tignor | Senior Software Architect
> > > > 155 Spring Street NY, NY 10012
> > > > p.212-285-8600 x385 f.212-285-8999
> > > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > >
> > >
> >
> >
> > --
> > TH!NKMAP
> >
> > Christopher Tignor | Senior Software Architect
> > 155 Spring Street NY, NY 10012
> > p.212-285-8600 x385 f.212-285-8999
> >
>



-- 
TH!NKMAP

Christopher Tignor | Senior Software Architect
155 Spring Street NY, NY 10012
p.212-285-8600 x385 f.212-285-8999

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message