lucene-java-user mailing list archives

From "Jeff Plater" <jpla...@healthmarketscience.com>
Subject RE: Edit distance and wildcard searching with PhraseQuery
Date Thu, 12 Nov 2009 00:48:52 GMT
Thanks for the suggestion - I double checked the case and it was OK.
Turned out I needed to use the StandardAnalyzer instead of the
WhitespaceAnalyzer.

-Jeff
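
One plausible explanation for why switching analyzers helped, sketched in
plain Java rather than with Lucene itself: the classic QueryParser (which
ComplexPhraseQueryParser extends) lowercases wildcard terms by default and
does not run them through the analyzer, while WhitespaceAnalyzer leaves
indexed tokens in their original case. "Phil*" would then be searched as
"phil*" against an indexed "Philadelphia". The thread does not confirm this
as the actual cause, and the helper names below are hypothetical stand-ins,
not Lucene APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class AnalyzerCaseSketch {
    // Stand-in for a whitespace-only analyzer: split on whitespace, keep case.
    static List<String> whitespaceTokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // Stand-in for an analyzer that also lowercases each token.
    static List<String> lowercasedTokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : whitespaceTokens(text)) {
            out.add(t.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    // Simplified prefix-wildcard match ("phil*") against indexed tokens.
    static boolean anyTokenHasPrefix(List<String> indexedTokens, String prefix) {
        for (String t : indexedTokens) {
            if (t.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumption: the parser has already lowercased "Phil*" to "phil*".
        String prefix = "Phil".toLowerCase(Locale.ROOT);

        List<String> caseKept = whitespaceTokens("Philadelphia PA");
        List<String> lowered = lowercasedTokens("Philadelphia PA");

        System.out.println("case-preserving index matches: "
                + anyTokenHasPrefix(caseKept, prefix));
        System.out.println("lowercased index matches: "
                + anyTokenHasPrefix(lowered, prefix));
    }
}
```

If this is what happened, an alternative to changing analyzers would be to
keep the wildcard term's case at query time, at the cost of case-sensitive
searches.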

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: Wednesday, November 11, 2009 6:52 PM
To: java-user@lucene.apache.org
Subject: Re: Edit distance and wildcard searching with PhraseQuery

I'd at least use something that lowercases the input rather than just
WhitespaceAnalyzer. Remember to use it at both index time and query time.
Between your queries and typing things in e-mails, case is often a gotcha.

At least carefully check that your casing is identical.

Best
Erick
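
A case mismatch can hide behind a fuzzy query even when it breaks a wildcard
one, because it only costs one extra edit. The sketch below is plain Java,
not Lucene code, and assumes the classic fuzzy score of
1 - editDistance / min(length of query term, length of indexed term) with no
required prefix. Class and method names are hypothetical. Under that
assumption "Philadelphid" against "Philadelphia" scores about 0.92, and a
lowercased "philadelphid" against "Philadelphia" still scores about 0.83,
both above the 0.75 threshold.

```java
class FuzzySimilaritySketch {
    // Standard Levenshtein distance using two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed score: 1 - distance / length of the shorter term.
    static double similarity(String queryTerm, String indexedTerm) {
        int shorter = Math.min(queryTerm.length(), indexedTerm.length());
        if (shorter == 0) {
            return 0.0;
        }
        return 1.0 - (double) editDistance(queryTerm, indexedTerm) / shorter;
    }

    public static void main(String[] args) {
        double minimumSimilarity = 0.75;
        String indexed = "Philadelphia";
        for (String query : new String[] {"Philadelphid", "philadelphid"}) {
            double score = similarity(query, indexed);
            System.out.println(query + " vs " + indexed + ": " + score
                    + (score > minimumSimilarity ? " (hit)" : " (miss)"));
        }
    }
}
```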

On Wed, Nov 11, 2009 at 6:41 PM, Jeff Plater <jplater@healthmarketscience.com> wrote:

> Thanks - I tried it out and it seems to work for "Philadelphid~0.75 PA",
> but I can't get it working for "Phil* PA" yet.  Perhaps it is an issue
> with my Analyzer (I am using WhitespaceAnalyzer)?  Have you used it with
> wildcards before?
>
> -Jeff
>
> -----Original Message-----
> From: AHMET ARSLAN [mailto:iorixxx@yahoo.com]
> Sent: Wednesday, November 11, 2009 5:55 PM
> To: java-user@lucene.apache.org
> Subject: Re: Edit distance and wildcard searching with PhraseQuery
>
> What you are looking for is ComplexPhraseQueryParser [1], which is
> implemented in Lucene 2.9.0. It uses the SpanQuery family.
> It supports "Phil* PA"~10 as well as "Philadelphid~0.75 PA", and more
> generally ranges, OR, fuzzy and wildcard terms inside proximity (phrase)
> queries.
>
>
> [1] http://lucene.apache.org/java/2_9_0/api/contrib-misc/org/apache/lucene/queryParser/complexPhrase/package-summary.html
>
> [2] https://issues.apache.org/jira/browse/LUCENE-1486
>
>
> > I am trying to figure out a way that I can query a Lucene index for a
> > phrase but have some fuzziness (edit distance and/or wildcard) applied
> > to the individual terms.  An example should help explain what I am
> > trying to do:
> >
> > Index contains:
> >
> > Philadelphia PA
> >
> > Search is done on:
> >
> > Philadelphid PA
> >
> > I want it to result in a hit - basically something like
> > "Philadelphid~0.75 PA" (that syntax is not valid but explains what I am
> > looking for).  Similarly, I would like to be able to do something like
> > "Phil* PA" and get a hit as well.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
