lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: About The Lucene Query Syntax
Date Mon, 15 Sep 2008 14:25:03 GMT
The unsatisfactory answer is "because that's the
way it works".

I suspect that the underlying issue is what happens when
you try to expand phrase searches via wildcards. Wildcard
searches are already plagued by "TooManyClauses" exceptions,
which would only get worse with phrases. In fact, it would
probably become downright impossible.

You'd have to find all the wildcards that matched. Then find
all the words that were one position less. Then form a huge
OR query of all these phrases. Which would be really ugly.
And that's only a single wildcard in a phrase. Imagine trying
to form the query of "a* b* c*". The OR clause would be
the cross product of all terms that started with one of the
three letters.
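
To make the blow-up concrete, here is a rough sketch in plain
Java. It uses no Lucene classes at all; the class name, the
method names and the seven-word vocabulary are all made up for
illustration. It expands each prefix against the vocabulary
and builds one candidate phrase per combination:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy illustration only: hypothetical names, no Lucene classes involved.
class PhraseExpansion {

    // Return every vocabulary term that starts with the given prefix.
    static List<String> matchPrefix(List<String> vocabulary, String prefix) {
        List<String> matches = new ArrayList<>();
        for (String term : vocabulary) {
            if (term.startsWith(prefix)) {
                matches.add(term);
            }
        }
        return matches;
    }

    // Cross product of the per-slot matches: one phrase per combination.
    static List<String> expand(List<String> vocabulary, String... prefixes) {
        List<String> phrases = new ArrayList<>();
        phrases.add("");
        for (String prefix : prefixes) {
            List<String> next = new ArrayList<>();
            for (String partial : phrases) {
                for (String term : matchPrefix(vocabulary, prefix)) {
                    next.add(partial.isEmpty() ? term : partial + " " + term);
                }
            }
            phrases = next;
        }
        return phrases;
    }

    public static void main(String[] args) {
        // Made-up vocabulary: 2 "a" terms, 3 "b" terms, 2 "c" terms.
        List<String> vocabulary = Arrays.asList(
            "apple", "apply", "banana", "band", "bank", "cat", "car");
        List<String> phrases = expand(vocabulary, "a", "b", "c");
        System.out.println(phrases.size() + " candidate phrases"); // 12 candidate phrases
    }
}
```

With only 2, 3 and 2 matching terms the expansion is already
2 * 3 * 2 = 12 phrases. A real index can have thousands of
terms per prefix, so the product grows very quickly.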

You could possibly trim that down by finding the offsets in
each document for all the terms that began with any of
the three letters and "restricting" the massive OR query
by the term positions of each word in each document
and then forming them into a huge query.... But you
see where this is going. To do any of this is prohibitively
expensive.
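
For what it's worth, the position-restricted idea might look
something like the sketch below. Again this is plain Java over
a toy in-memory positional index, not Lucene code, and it
simplifies things to one prefix per phrase slot. The class,
the methods and the three tiny documents are invented for
illustration:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy illustration only: hypothetical names, not how Lucene stores positions.
class PositionalPrefixPhrase {

    // Build term -> (docId -> positions of that term in the document).
    static Map<String, Map<Integer, Set<Integer>>> buildIndex(String[][] docs) {
        Map<String, Map<Integer, Set<Integer>>> index = new HashMap<>();
        for (int doc = 0; doc < docs.length; doc++) {
            for (int pos = 0; pos < docs[doc].length; pos++) {
                index.computeIfAbsent(docs[doc][pos], t -> new HashMap<>())
                     .computeIfAbsent(doc, d -> new HashSet<>())
                     .add(pos);
            }
        }
        return index;
    }

    // Merge the positions of every term that starts with the prefix.
    static Map<Integer, Set<Integer>> positionsForPrefix(
            Map<String, Map<Integer, Set<Integer>>> index, String prefix) {
        Map<Integer, Set<Integer>> merged = new HashMap<>();
        for (Map.Entry<String, Map<Integer, Set<Integer>>> termEntry : index.entrySet()) {
            if (!termEntry.getKey().startsWith(prefix)) {
                continue;
            }
            for (Map.Entry<Integer, Set<Integer>> docEntry : termEntry.getValue().entrySet()) {
                merged.computeIfAbsent(docEntry.getKey(), d -> new HashSet<>())
                      .addAll(docEntry.getValue());
            }
        }
        return merged;
    }

    // A document matches if slot i has a matching term at position start + i.
    static Set<Integer> search(Map<String, Map<Integer, Set<Integer>>> index,
                               String... prefixes) {
        List<Map<Integer, Set<Integer>>> slots = new ArrayList<>();
        for (String prefix : prefixes) {
            slots.add(positionsForPrefix(index, prefix));
        }
        Set<Integer> hits = new TreeSet<>();
        if (slots.isEmpty()) {
            return hits;
        }
        for (Map.Entry<Integer, Set<Integer>> first : slots.get(0).entrySet()) {
            int doc = first.getKey();
            for (int start : first.getValue()) {
                boolean match = true;
                for (int i = 1; i < slots.size() && match; i++) {
                    Set<Integer> positions = slots.get(i).get(doc);
                    match = positions != null && positions.contains(start + i);
                }
                if (match) {
                    hits.add(doc);
                    break;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up documents, already tokenized and lowercased.
        String[][] docs = {
            {"ntv", "technology", "gunlugu"},
            {"technology", "gunlugunde", "ntv"},
            {"gunlugu", "technology"}
        };
        System.out.println(search(buildIndex(docs), "technology", "gunlugu")); // [0, 1]
    }
}
```

Even in this toy form, every slot has to pull the positions of
every matching term in every document before anything can be
intersected, which is where the cost comes from.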

Take all this with a grain of salt since I haven't been in the
guts of the Lucene search engine, but I suspect that
the very bright folks who've coded this whole thing up
would have done this long ago if it was reasonable.

Best
Erick


2008/9/15 M. Fatih Soydan <idlemyth@gmail.com>

> I read it, but I didn't understand why not.
>
> On Monday, 15 September 2008 at 16:56, Erick Erickson
> <erickerickson@gmail.com> wrote:
> > wildcards are NOT supported within double quotes, so if
> > you are submitting your query
> > "Technology Gunlugu*"
> > WITH the double quotes, you are searching for
> > that literal phrase.
> >
> > Best
> > Erick
> >
> > P.S. See:
> >
> > http://lucene.apache.org/java/docs/queryparsersyntax.html
> > the first line under "wildcard searches"
> >
> >
> > 2008/9/15 Fatih Soydan <idlemyth@gmail.com>
> >
> >> Hi;
> >>
> >>
> >>
> >> I am trying to write an application that runs on BlackBerry and other
> >> Java-enabled phones. The application talks to a server and asks it some
> >> questions. The server side is C#, and I am using Apache Lucene.Net in this
> >> project.
> >>
> >>
> >>
> >> I searched for a forum or mailing list but haven't found an answer yet. I
> >> have a problem with the query syntax.
> >>
> >>
> >>
> >> I want to search for this:
> >>
> >> "Technology Gunlugu*" AND "NTV"
> >>
> >>
> >>
> >> But it doesn't return any results, because of "Technology Gunlugu*".
> >>
> >> When I search for
> >>
> >> "Technology Gunlugu" AND "NTV"   it returns 3 matching records
> >>
> >> "Technology Gunlugunde" AND "NTV"   it returns 1 matching record
> >>
> >>
> >>
> >>
> >>
> >> I debugged my project step by step.
> >>
> >>
> >>
> >> In Lucene.Net.Search.IndexSearcher:
> >>
> >> public override Query Rewrite(Query original)
> >> {
> >>     Query query = original;
> >>     for (Query rewrittenQuery = query.Rewrite(reader); rewrittenQuery != query;
> >>          rewrittenQuery = query.Rewrite(reader))
> >>     {
> >>         query = rewrittenQuery;
> >>     }
> >>     return query;
> >> }
> >>
> >>
> >>
> >> If the query is a PrefixQuery, Gunlugu* turns into Gunlugunde OR Gunlugu.
> >>
> >> But if the query is a default Query (Lucene.Net.Search.Query),
> >> "technology gunlugu*", it returns a null query.
> >>
> >>
> >>
> >>
> >>
> >> What can I do?
> >>
> >>
> >>
> >> (Sorry for my bad English)
> >>
> >>
> >>
> >> FATIH SOYDAN
> >>
> >>
> >>
> >>
> >
>