lucene-java-user mailing list archives

From Tatu Saloranta <>
Subject Re: Zero hits for queries ending with a number
Date Sat, 03 Apr 2004 16:24:02 GMT
On Saturday 03 April 2004 08:34, wrote:
> On Saturday 03 April 2004 17:11, Erik Hatcher wrote:
> > No objections that error messages and such could be made clearer.
> > Patches welcome!  Care to submit better error message handling in this
> > case?  Or perhaps allow lower-case "to"?
> I think the best would be if Lucene would simply have a
> setCaseSensitive(boolean).
> IMHO it's in any case a bad idea to make searches case-sensitive (per
> default).

I'd have to disagree. I think that a search engine core should not have to
bother with details of character sets, such as lower-casing. Rules for
lower/upper/initial/mixed case across all Unicode languages are rather
involved... and if you tried to do that, the next question would be whether
accents and umlauts should matter or not (which is language
dependent). That's why to me the natural way to go is for the core to do
direct comparison when executing queries, and not concern itself with case.
This does not prevent anyone from implementing such functionality (see below).
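
To make the "rather involved" part concrete, here is a small sketch using only the standard Java library (nothing Lucene-specific). The class and method names are made up for illustration. It shows that lower-casing depends on the locale, that upper-casing can change string length, and that dropping umlauts is a separate policy decision on top of case.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical illustration class; not part of Lucene.
final class CaseRulesSketch {

    /** Lower-cases using the case rules of the given locale. */
    static String lower(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    /**
     * One possible accent-folding policy: decompose characters (NFD) and
     * drop the combining marks. Whether this is desirable is language dependent.
     */
    static String stripMarks(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{M}", "");
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");

        // Same input, two locales: Turkish maps 'I' to dotless i (U+0131).
        System.out.println(lower("TITLE", Locale.ROOT).equals(lower("TITLE", turkish)));

        // German sharp s (U+00DF) upper-cases to two letters.
        System.out.println("stra\u00dfe".toUpperCase(Locale.ROOT));

        // u-umlaut (U+00FC) folded to a plain 'u'.
        System.out.println(stripMarks("M\u00fcnchen"));
    }
}
```

None of these choices is universally right, which is the argument for keeping them out of the core and in a layer the application can configure.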

I think the architecture and design of the Lucene core is delightfully simple.
One can easily create case-independent functionality by using proper
analyzers and (for the most part) configuring QueryParser. I would agree,
however, that QueryParser is a "victim of its own success"; it's too often
used in situations where one really should create a proper GUI that builds
the query. Backend code can then mangle the input as it sees fit, and build
query objects.
QueryParser is more natural for quick-n-dirty scenarios, where one just has to
slap something together quickly, or if one only has a textual interface to
deal with. It's a nice thing to have, but it has its limitations; there's no
way to create one parser that's perfect for every use(r).
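
As a rough sketch of the division of labour described above: the "core" below matches terms by exact string equality only, and case-insensitivity comes entirely from the analysis step applied to both indexed text and queries. This is a toy written against the standard Java library, not Lucene's API; `TinyIndex` and `analyzer` are hypothetical names standing in for an index and an analyzer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

// Hypothetical illustration class; not part of Lucene.
final class CaseFoldingSketch {

    /** Stand-in for an analyzer: splits on whitespace, optionally lower-cases. */
    static Function<String, List<String>> analyzer(boolean lowerCase) {
        return text -> {
            List<String> terms = new ArrayList<>();
            for (String token : text.trim().split("\\s+")) {
                if (token.isEmpty()) continue;
                terms.add(lowerCase ? token.toLowerCase(Locale.ROOT) : token);
            }
            return terms;
        };
    }

    /** Stand-in for the core: knows nothing about case, compares terms exactly. */
    static final class TinyIndex {
        private final Map<String, Set<Integer>> postings = new HashMap<>();
        private final Function<String, List<String>> analyzer;

        TinyIndex(Function<String, List<String>> analyzer) {
            this.analyzer = analyzer;
        }

        void add(int docId, String text) {
            for (String term : analyzer.apply(text)) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }

        /** Returns ids of documents containing every analyzed query term. */
        Set<Integer> search(String query) {
            Set<Integer> result = null;
            for (String term : analyzer.apply(query)) {
                Set<Integer> docs = postings.getOrDefault(term, new TreeSet<>());
                if (result == null) result = new TreeSet<>(docs);
                else result.retainAll(docs);
            }
            return result == null ? new TreeSet<>() : result;
        }
    }

    public static void main(String[] args) {
        TinyIndex folded = new TinyIndex(analyzer(true));
        folded.add(1, "Lucene Query Parser");

        TinyIndex exact = new TinyIndex(analyzer(false));
        exact.add(1, "Lucene Query Parser");

        System.out.println(folded.search("lucene PARSER")); // [1]
        System.out.println(exact.search("lucene PARSER"));  // []
    }
}
```

The important detail is that the same analysis is used at index time and at query time; lower-casing only one side is the kind of mismatch that produces zero hits.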

What could be done is to make sure all examples / demo web apps implement
case-insensitive indexing and searching, since that is often what is needed.

-+ Tatu +-

> > But, also, folks need to really step back and practice basic
> > troubleshooting skills.  I asked you if that string was what you passed
> > to the QueryParser and you said yes, when in fact it was not.  And you
> I forgot that I did lower-case it. In fact I even output it in its original
> state but lower-case it just before I pass it to Lucene. That lower-casing
> is what I would call a hack and hence it's no surprise that I forgot it :-)
> Timo
