lucene-java-user mailing list archives

From Christopher Tignor <ctig...@thinkmap.com>
Subject Re: SpanQuery for Terms at same position
Date Wed, 25 Nov 2009 20:20:33 GMT
It's worth noting, however, that this -1 slop doesn't seem to work for cases
where you want to discover instances of more than two terms at the same
position.  It would be nice to be able to explicitly set this in the query
construction.
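
One possible explanation, sketched below as a toy model rather than as Lucene code: assume the unordered check accepts a set of spans when (maxEnd - minStart) - totalLength <= slop, with each token at position p treated as the span [p, p + 1). That rule is an assumption about the internals, and the names SamePositionSlop, matchesUnordered and slopForSamePosition are made up for illustration. Under that assumption a slop of -1 is exact for two terms but too loose for three, and the tight value would be 1 - n for n terms.

```java
import java.util.Arrays;

/** Toy model of an unordered near-span check. Hypothetical, not Lucene code. */
final class SamePositionSlop {

    /**
     * Assumed rule: single-token spans match when
     * (maxEnd - minStart) - totalLength <= slop.
     * A token at position p is modelled as the span [p, p + 1), so its length is 1.
     * At least one position must be supplied.
     */
    static boolean matchesUnordered(int slop, int... positions) {
        int minStart = Arrays.stream(positions).min().getAsInt();
        int maxEnd = Arrays.stream(positions).max().getAsInt() + 1;
        int totalLength = positions.length; // one position per token
        return (maxEnd - minStart) - totalLength <= slop;
    }

    /** Slop that, under the assumed rule, only accepts termCount tokens sharing one position. */
    static int slopForSamePosition(int termCount) {
        return 1 - termCount;
    }

    public static void main(String[] args) {
        // Two tokens stacked at position 5, e.g. "plan" and "_v".
        System.out.println(matchesUnordered(-1, 5, 5));    // true
        System.out.println(matchesUnordered(-1, 5, 6));    // false: adjacent, not stacked
        // Three tokens: slop -1 also accepts a stacked pair plus a neighbour.
        System.out.println(matchesUnordered(-1, 5, 5, 6)); // true
        System.out.println(matchesUnordered(slopForSamePosition(3), 5, 5, 6)); // false
        System.out.println(matchesUnordered(slopForSamePosition(3), 5, 5, 5)); // true
    }
}
```

Whether SpanNearQuery really behaves this way for slops below -1 is not something this sketch verifies; it would need to be checked against the existing test cases.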

thanks,

C>T>
On Tue, Nov 24, 2009 at 9:17 AM, Christopher Tignor <ctignor@thinkmap.com> wrote:

> yes that indeed works for me.
>
> thanks,
>
> C>T>
>
>
> On Mon, Nov 23, 2009 at 5:50 PM, Paul Elschot <paul.elschot@xs4all.nl> wrote:
>
>> On Monday 23 November 2009 20:07:58, Christopher Tignor wrote:
>> > Also, I noticed that with the above edit to NearSpansOrdered I am getting
>> > erroneous results for normal ordered searches such as:
>> >
>> > "_n" followed by "work"
>> >
>> > where, because "_n" and "work" are at the same position, the code changes
>> > accept their pairing as a valid in-order result now that the "equal to"
>> > clause has been added to the inequality.
>>
>> Thanks for trying this. Indeed the "followed by" semantics is broken for
>> the ordered case when spans at the same positions are considered
>> ordered.
>>
>> Did I understand correctly that the unordered case with a slop of -1
>> and without the edit works to match terms at the same position?
>> In that case it may be worthwhile to add that to the javadocs,
>> and also add a few testcases.
>>
>> Regards,
>> Paul Elschot
>>
>> >
>> > C>T>
>> >
>> > On Mon, Nov 23, 2009 at 12:26 PM, Christopher Tignor <ctignor@thinkmap.com> wrote:
>> >
>> > > Thanks so much for this.
>> > >
>> > > Using an un-ordered query, the -1 slop indeed returns the correct results,
>> > > matching tokens at the same position.
>> > >
>> > > I tried the same query but ordered, both before and after rebuilding the
>> > > source with Paul's changes to NearSpansOrdered, but the query was still
>> > > failing, returning no results.
>> > >
>> > > C>T>
>> > >
>> > >
>> > > On Mon, Nov 23, 2009 at 11:59 AM, Mark Miller <markrmiller@gmail.com> wrote:
>> > >
>> > >> You're trying -1 with ordered, right? Try it with non-ordered.
>> > >>
>> > >> Christopher Tignor wrote:
>> > >> > A slop of -1 doesn't work either.  I get no results returned.
>> > >> >
>> > >> > This would be a *really* helpful feature for me if someone might suggest
>> > >> > an implementation, as I would really like to be able to do arbitrary span
>> > >> > searches where tokens may be at the same position and also in other
>> > >> > positions where the ordering of subsequent terms may be restricted as per
>> > >> > the normal span API.
>> > >> >
>> > >> > thanks,
>> > >> >
>> > >> > C>T>
>> > >> >
>> > >> > On Sun, Nov 22, 2009 at 7:50 AM, Paul Elschot <paul.elschot@xs4all.nl> wrote:
>> > >> >
>> > >> >
>> > >> >> On Sunday 22 November 2009 04:47:50, Adriano Crestani wrote:
>> > >> >>
>> > >> >>> Hi,
>> > >> >>>
>> > >> >>> I didn't test, but you might want to try SpanNearQuery and set slop to
>> > >> >>> zero.
>> > >> >>>
>> > >> >>> Give it a try and let me know if it worked.
>> > >> >>>
>> > >> >> The slop is the number of positions "in between", so zero would still be
>> > >> >> too much to only match at the same position.
>> > >> >>
>> > >> >> SpanNearQuery may or may not work for a slop of -1, but one could try
>> > >> >> that for both the ordered and unordered cases.
>> > >> >> One way to do that is to start from the existing test cases.
>> > >> >>
>> > >> >> Regards,
>> > >> >> Paul Elschot
>> > >> >>
>> > >> >>
>> > >> >>> Regards,
>> > >> >>> Adriano Crestani
>> > >> >>>
>> > >> >>> On Thu, Nov 19, 2009 at 7:28 PM, Christopher Tignor <ctignor@thinkmap.com> wrote:
>> > >> >>
>> > >> >>>> Hello,
>> > >> >>>>
>> > >> >>>> I would like to search for all documents that contain both "plan" and
>> > >> >>>> "_v" (my part-of-speech token for verb) at the same position.
>> > >> >>>> I have tokenized the documents accordingly so these tokens exist at the
>> > >> >>>> same location.
>> > >> >>>>
>> > >> >>>> I can achieve this programmatically using PhraseQueries by adding the Terms
>> > >> >>>> explicitly at the same position, but I need to be able to recover the
>> > >> >>>> Payload data for each term found within the matched instance of my query.
>> > >> >>>>
>> > >> >>>> Unfortunately the PayloadSpanUtil doesn't seem to return the same results
>> > >> >>>> as the PhraseQuery, possibly because it is converting it into Spans first,
>> > >> >>>> which do not support searching for Terms at the same document position?
>> > >> >>>> Any help appreciated.
>> > >> >>>>
>> > >> >>>> thanks,
>> > >> >>>>
>> > >> >>>> C>T>
>> > >> >>>>
>> > >> >>>> --
>> > >> >>>> TH!NKMAP
>> > >> >>>>
>> > >> >>>> Christopher Tignor | Senior Software Architect
>> > >> >>>> 155 Spring Street NY, NY 10012
>> > >> >>>> p.212-285-8600 x385 f.212-285-8999
>> > >> >>>>
>> > >> >>>>
>> > >> >>
>> ---------------------------------------------------------------------
>> > >> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> > >> >> For additional commands, e-mail: java-user-help@lucene.apache.org
>> > >> >>
>> > >> >>
>> > >> >>
>> > >> >
>> > >> >
>> > >> >
>> > >>
>> > >>
>> > >>
>> > >>
>> > >
>> > >
>> > >
>> >
>> >
>> >
>> >
>>
>>
>>
>
>
>



-- 
TH!NKMAP

Christopher Tignor | Senior Software Architect
155 Spring Street NY, NY 10012
p.212-285-8600 x385 f.212-285-8999
