lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: A question regarding the setSlop method of class PhraseQuery (Lucene version 3.0.1)
Date Mon, 28 Jun 2010 13:20:08 GMT
I think you're misunderstanding the intent of PhraseQueries and slop. Slop
is the number of intervening tokens that may exist between the words
you're looking for. However, all the words you're looking for MUST exist.
So,

<<< whenever the search phrase contains a word that doesn't
exist in the document, the search result will be empty >>>

is exactly how this is intended to work.

HTH
Erick
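
A minimal sketch of the rule described above, in plain Java rather than
against the Lucene API. It models slop the way the setSlop Javadoc quoted
further down describes it: each query term's position in the document is
compared with its position in the phrase, and the spread of those offsets
must not exceed the slop. The class name SlopSketch and its methods are
hypothetical, tokenization is plain whitespace splitting with no stop-word
removal, and repeated query terms get no special handling, so treat this as
an illustration of the idea and not as Lucene's actual scorer.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model of sloppy phrase matching; not Lucene code.
final class SlopSketch {

    // Assumption: lower-case whitespace tokenization, no stop-word removal.
    static List<String> tokens(String text) {
        return Arrays.asList(text.toLowerCase().split("\\s+"));
    }

    // True only if EVERY query term occurs in the document and some choice
    // of occurrences keeps the spread of (docPos - queryPos) within slop.
    static boolean matches(List<String> doc, List<String> query, int slop) {
        if (query.isEmpty()) {
            return false;
        }
        return search(doc, query, 0, slop, Integer.MAX_VALUE, Integer.MIN_VALUE);
    }

    private static boolean search(List<String> doc, List<String> query,
                                  int q, int slop, int min, int max) {
        if (q == query.size()) {
            return max - min <= slop;
        }
        for (int p = 0; p < doc.size(); p++) {
            if (doc.get(p).equals(query.get(q))) {
                int shift = p - q; // how far this term sits from its phrase slot
                if (search(doc, query, q + 1, slop,
                           Math.min(min, shift), Math.max(max, shift))) {
                    return true;
                }
            }
        }
        // Either the term is absent from the document or no occurrence fits.
        return false;
    }

    public static void main(String[] args) {
        List<String> doc = tokens("This is a test");

        // "formal" is not in the document, so no slop value can help.
        System.out.println(matches(doc, tokens("This is a formal test"), 100)); // false

        // All terms present; "test" is one position further out than in the phrase.
        System.out.println(matches(doc, tokens("This is test"), 0)); // false
        System.out.println(matches(doc, tokens("This is test"), 1)); // true

        // Swapping two adjacent words needs a slop of two, as the Javadoc says.
        System.out.println(matches(tokens("test this"), tokens("this test"), 1)); // false
        System.out.println(matches(tokens("test this"), tokens("this test"), 2)); // true
    }
}
```

In this model a larger slop only loosens where the terms may sit. It never
makes a missing term optional; for that, a BooleanQuery with optional
(SHOULD) clauses is the usual tool.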


On Mon, Jun 28, 2010 at 9:09 AM, a peng <zhoudengpeng@gmail.com> wrote:

> Hi,
>
> My test shows that whenever the search phrase contains a word that doesn't
> exist in the document, the search result is empty no matter how large a
> slop factor I set. Is this a bug in Lucene, or does it work as designed?
>
> 2010/6/28 tarun sapra <t.sapra97@gmail.com>
>
> > Hi,
> >
> > I think I understand what's happening here.
> >
> > Indexed content: "This is a test".
> > Your search phrase: "This is a formal test".
> >
> > You are setting the slop factor to 2; with a slop factor of 3 it should
> > work. Because "is" and "a" are stop words, the words "This" and "test"
> > are a slop of 2 apart in the indexed content, but in your search phrase
> > "This is a formal test" they are a slop of 3 apart, which is why it's
> > not working. In the search phrase "This is formal test" the words "This"
> > and "test" are a slop of 2 apart, which is why that phrase works.
> >
> >
> >
> > On Mon, Jun 28, 2010 at 11:37 AM, a peng <zhoudengpeng@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I am using StandardAnalyzer(Version.LUCENE_30);
> > >
> > > 2010/6/27 tarun sapra <t.sapra97@gmail.com>
> > >
> > > > Which analyzer are you using?
> > > >
> > > >
> > > > > On Sun, Jun 27, 2010 at 7:12 AM, a peng <zhoudengpeng@gmail.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > > I know the indexed content contains the following text:
> > > > > > "This is a test". The search phrase I used is "This is a formal
> > > > > > test", and I set the slop of the PhraseQuery to 2 with
> > > > > > setSlop(2), but I get no search result. If I change the search
> > > > > > phrase to "This is formal test", then I do get a result.
> > > > > >
> > > > > > So what is the problem here? Thanks in advance.
> > > > >
> > > > >
> > > > > Attached is the Java doc for the setSlop method:
> > > > >
> > > > > > public void setSlop(int s)
> > > > > >
> > > > > > Sets the number of other words permitted between words in query
> > > > > > phrase. If zero, then this is an exact phrase search. For larger
> > > > > > values this works like a WITHIN or NEAR operator.
> > > > > >
> > > > > > The slop is in fact an edit-distance, where the units correspond
> > > > > > to moves of terms in the query phrase out of position. For
> > > > > > example, to switch the order of two words requires two moves
> > > > > > (the first move places the words atop one another), so to permit
> > > > > > re-orderings of phrases, the slop must be at least two.
> > > > > >
> > > > > > More exact matches are scored higher than sloppier matches, thus
> > > > > > search results are sorted by exactness.
> > > > > >
> > > > > > The slop is zero by default, requiring exact matches.
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Thanks & Regards
> > > > Tarun Sapra
> > > >
> > >
> >
> >
> >
> > --
> > Thanks & Regards
> > Tarun Sapra
> >
>
