lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Search query with wildcard and spaces
Date Tue, 31 Jul 2007 15:19:30 GMT
You're going to have to delve into the details of what the various analyzers
do. And perhaps write your own.

The syntax "something and"*, with the asterisk outside the quotes isn't
supported syntax as far as I know.

Adding quotes changes the syntax, so "some word*" is a phrase query,
which probably isn't what you want.

Some things to think about.... How do spaces play in your problem space?
What about capitalizations? One solution is to take total control over
the inputs, and index things exactly as you wish, for instance
substitute underbars for spaces, manually lowercase things (or writer a
filter
and/or analyzer) and index using WhitespaceAnalyzer. Do something similar
on searches.  Of course, you need to do a similar process for your search
terms.

You could also think about SpanFirstQuery, although you have to construct
that query yourself since, last I knew, there wasn't support for the Span
family in the QueryParser.

KeywordAnalyzer might help here.

What would definitely help you is to get a copy of Luke and try various
analyzers out to see how they parse the query strings you want to try with
various analyzers. This will give you a MUCH better idea of what actually
happens with the analyzers. Google lucene luke......

Best
Erick

On 7/31/07, jean-eric.cuendet@vd.ch <jean-eric.cuendet@vd.ch> wrote:
>
> > is this just one single example of different words that should
> > return the same results? You might consider implementing a synonym
> > analyzer otherwise.
>
> No, the query should match all of them.
> The query:
>   NAME:De Agos* AND FIRST:Maria
> should return 2 documents:
>   NAME: De agostino
>   FIRST: Maria
>
>   NAME: De agostato
>   FIRST: Maria
>
> Do I need synonym for that?
>
> > In your case, storing NAME as UN_TOKENIZED should enable your
> > NAME:"De Agos"* search
>
> The query:
>    NOM:"Zullo"* AND PRENOM:"Andrea"* AND DATE_NAISSANCE:19630622
> doesn't work:
>    Cannot parse 'NOM:"De Zullo"* AND PRENOM:"Andrea"* AND
> DATE_NAISSANCE:19630622': '*' or '?' not allowed as first character in
> WildcardQuery
>
> Any idea?
>
> > Regards Ard
> >
> > >
> > > Hi,
> > > I would like to make a search query that should match the following
> > > documents:
> > >
> > >    NAME: De agostino
> > >    FIRST: Maria
> > >
> > >    NAME: De agostato
> > >    FIRST: Maria
> > >
> > > How to design the query? The following:
> > >    NAME:De Agos* AND FIRST:Maria
> > > Doesn't work since there is a space in the name. And:
> > >    NAME:"De Agos"* AND FIRST:Maria
> > >    NAME:"De Agos*" AND FIRST:Maria
> > > doesn't work neither, since the double-quotes don't accept
> > > the wirldcard.
> > >
> > > Thanks for help
> > > -jec
> > >
> > > ---
> > > Jean-Eric Cuendet
> > > Etat de Vaud, Direction des systemes d'information (DSI)
> > > Av. Recordon 1, 1014 Lausanne
> > > Tel : +41 21 316 15 79 – Mob : +41 76 222 33 43
> > > mailto: jean-eric.cuendet@vd.ch - http://www.vd.ch
> > >
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message