lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Partial Word Matches
Date Sat, 11 Nov 2006 22:50:38 GMT
See below.....

On 11/11/06, Storey, Jeff <jeff.storey@dac.us> wrote:
>
> I stand corrected -- I am NOT getting partial matches, I was extracting
> partial matches from the text programmatically and thought that's what
> was being returned.
>
> On another topic, regarding Boolean queries and wildcard queries, I have
> two questions:
>
> It seems like when I enter the query "ball AND basket" it returns
> different results than "ball and basket." Is there a way to make the
> Boolean operators case insensitive?


No. I've found this kind of odd too. A lower-case "and" is parsed as an
ordinary search term rather than as an operator. Here's a snippet...

Boolean operators allow terms to be combined through logic operators. Lucene
supports AND, "+", OR, NOT and "-" as Boolean operators (Note: Boolean
operators must be ALL CAPS).

from http://sewm.pku.edu.cn/src/clucene/doc/queryparsersyntax.html
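
If you want users to be able to type "and" in any case, one workaround is to
upper-case the operator words yourself before handing the string to the query
parser. Below is a minimal sketch of that idea in plain Java. The class name
OperatorNormalizer and the method normalize() are made up for illustration and
are not part of Lucene. It leaves double-quoted phrases alone, and it assumes
nobody needs to search for the literal words "and", "or" or "not" outside a
phrase.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a Lucene class.
public class OperatorNormalizer {
    // Whole-word and/or/not in any letter case.
    private static final Pattern OPERATOR =
        Pattern.compile("\\b(and|or|not)\\b", Pattern.CASE_INSENSITIVE);

    /** Upper-cases and/or/not everywhere except inside double-quoted phrases. */
    public static String normalize(String query) {
        StringBuilder out = new StringBuilder();
        // Splitting on the quote character: even-numbered pieces lie outside
        // quotes, odd-numbered pieces lie inside them.
        String[] pieces = query.split("\"", -1);
        for (int i = 0; i < pieces.length; i++) {
            if (i > 0) {
                out.append('"'); // put back the quote removed by split()
            }
            if (i % 2 == 0) {
                Matcher m = OPERATOR.matcher(pieces[i]);
                StringBuffer sb = new StringBuffer();
                while (m.find()) {
                    m.appendReplacement(sb, m.group(1).toUpperCase(Locale.ROOT));
                }
                m.appendTail(sb);
                out.append(sb);
            } else {
                out.append(pieces[i]); // quoted phrase, left untouched
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("ball and basket"));
    }
}
```

You would then pass the normalized string to the query parser instead of the
raw user input.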



> As for wildcard searches, one of the things I attempt to do is pick out
> the words that caused a particular file to be returned. However, when I
> search for the term "yellow~" I might get something like "bellow." Is
> there a way to list what Lucene found in the document that made it
> relevant?


Well, first, ~ isn't a wildcard, it's a "fuzzy search" (aka similar terms).
So getting "bellow" for "yellow~" is expected. Somewhat confusingly, though,
a ~ after a quoted phrase, as in "lemon orange"~10, is a proximity search.
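
To see why "bellow" counts as similar to "yellow": fuzzy matching is based on
edit distance (Levenshtein), i.e. the number of single-character insertions,
deletions and substitutions needed to turn one term into the other. Here is a
small stand-alone sketch of that calculation. It is only an illustration with
a made-up class name, not Lucene's own code, and it says nothing about the
similarity threshold Lucene applies on top of the distance.

```java
// Illustrative only, not a Lucene class.
public class EditDistance {
    /** Classic two-row dynamic-programming Levenshtein distance. */
    public static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Turning the empty prefix of a into the first j chars of b takes j inserts.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting the first i chars of a
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                    Math.min(curr[j - 1] + 1,   // insertion
                             prev[j] + 1),      // deletion
                    prev[j - 1] + cost);        // substitution or match
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // One substitution (y -> b) separates the two terms.
        System.out.println(levenshtein("yellow", "bellow")); // prints 1
    }
}
```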

* and ? are the wildcard characters.
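
For contrast with fuzzy search, here is a rough sketch of what the two
wildcard characters mean: ? stands for exactly one character and * for zero or
more. The code just translates a wildcard pattern into a java.util.regex
pattern and tests whole terms against it. The class WildcardSketch is
hypothetical and is not how Lucene implements wildcard queries internally.

```java
import java.util.regex.Pattern;

// Illustrative only, not a Lucene class.
public class WildcardSketch {
    /** Translates a pattern using * and ? into an equivalent regex. */
    public static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");   // zero or more characters
            } else if (c == '?') {
                regex.append('.');    // exactly one character
            } else {
                // Quote everything else so regex metacharacters stay literal.
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    /** True if the whole term matches the wildcard pattern. */
    public static boolean matches(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("te?t", "text"));    // true
        System.out.println(matches("test*", "tester")); // true
        System.out.println(matches("lo*", "belong"));   // false
    }
}
```

Note that the match is against the whole term, so "lo*" does not pick up
"belong".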

Searcher.explain() is probably your friend: it describes how a given document
scored against a query. I haven't used it much, but I've certainly seen it
mentioned enough.

If you haven't, get a copy of Luke (google lucene luke). It's a program that
allows you to explore your indexes, explain queries, explore the effects of
different analyzers, etc. Really, really, really get a copy <G>....


> Thanks for all the help.
>
> Jeff
>
> -----Original Message-----
> From: Paul Borgermans [mailto:paul.borgermans@gmail.com]
> Sent: Saturday, November 11, 2006 3:06 PM
> To: java-user@lucene.apache.org
> Subject: Re: Partial Word Matches
>
> Indeed, the only way this can happen as far as I know Lucene is by using
> a stemmer during indexing, the standard analyzer won't result in such
> behaviour.
>
> hth
>
> Paul
>
> On 11/11/06, Erick Erickson <erickerickson@gmail.com> wrote:
> >
> > That's not the default behavior, so I'm perplexed. Normally, you have
> > to go to considerable effort to get partial matches....
> >
> > What analyzers are you using at both index and query time? Perhaps as
> > short a code snippet as you could make showing this behavior would be
> > a good thing to post. I flat guarantee folks will look at it. But
> > please make it short <G>.
> >
> > Best
> > ERick
> >
> > On 11/11/06, Storey, Jeff <jeff.storey@dac.us> wrote:
> > >
> > > Hi. I'm using Lucene to do some searching (using the Searcher object
> > > and passing it a ParsedQuery). I search for a word such as "long"
> > > and it is returning partial matches, such as "belong" and "along."
> > > Is there a way to turn off this behavior and only match whole words?
> > >
> > >
> > >
> > > Thank you,
> > >
> > > Jeff
>
> --
> http://walhalla.wordpress.com
