lucene-java-user mailing list archives

From "Erick Erickson" <>
Subject Re: case insensitivity
Date Wed, 25 Jun 2008 13:02:23 GMT
Well, it depends on what you mean by "per term". There's already
PerFieldAnalyzerWrapper, which lets you pick an analyzer for each field,
but I don't think that's what you want.

How do you expect a per-term analyzer to behave? I'm having a hard
time thinking of a use case that's general. You could always
roll your own analyzer that doesn't change case for your particular
list of words.
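
As a rough illustration of the "roll your own" idea, here is a plain-Java sketch that does not use the Lucene Analyzer API at all. The class name, the PROTECTED list and the whitespace splitting are assumptions made up for the example. It lower-cases every token except the ones on the list:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class CasePreservingSketch {
    // Hypothetical list of words whose case is kept exactly as typed.
    static final Set<String> PROTECTED = Set.of("Dell");

    // Split on whitespace and lower-case every token that is not protected.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            if (raw.isEmpty()) {
                continue; // empty input yields a single empty string from split()
            }
            tokens.add(PROTECTED.contains(raw) ? raw : raw.toLowerCase(Locale.ROOT));
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("Dell Computers")); // [Dell, computers]
        System.out.println(analyze("dell computers")); // [dell, computers]
    }
}
```

Note that the two inputs produce different first tokens, so a query typed in lower case would not match text indexed with the capital "D".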

But the problem is your users. In your example, suppose a user
typed in "dell computers". Would that match "Dell computers"?
Does your analyzer automatically upper-case some words? If it
does, that's the same as lower casing them all. If it doesn't,
how do you explain that to your users?
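
To make the "same as lower casing them all" point concrete, here is a small plain-Java sketch. It is not Lucene code, and the class name and the CANONICAL map are hypothetical. If the analyzer rewrites every case variant of a word to one canonical spelling, all variants collapse to the same term, just as they would under plain lower-casing:

```java
import java.util.Locale;
import java.util.Map;

public class CanonicalCaseSketch {
    // Hypothetical map from the lower-cased form to the spelling stored in the index.
    static final Map<String, String> CANONICAL = Map.of("dell", "Dell");

    // Lower-case the token, then restore the canonical spelling if one is listed.
    static String normalize(String token) {
        String lower = token.toLowerCase(Locale.ROOT);
        return CANONICAL.getOrDefault(lower, lower);
    }

    public static void main(String[] args) {
        // Every case variant ends up as the same term, so the field is
        // effectively case-insensitive for this word.
        System.out.println(normalize("dell").equals(normalize("Dell"))); // true
        System.out.println(normalize("DELL"));                           // Dell
        System.out.println(normalize("Computers"));                      // computers
    }
}
```

With this kind of normalization a user can no longer ask for "Dell" with a capital "D" only, which was the point of the original request.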

All in all, I'm having a tough time imagining how this would work.
It's easy enough to say "let's assume", but I suspect that
whatever solution satisfies your example will have its own problems
that are far worse than just lower-casing things.


On Wed, Jun 25, 2008 at 5:37 AM, John Byrne <> wrote:

> Hi,
> I know that case-insensitive searching is normally done by creating an
> all-lower-case version of the documents, and turning the search terms into
> lower case whenever this field is searched, but this approach has its
> disadvantages.
> Let's say, for example, you want to find "Dell" (with a capital "D"), near
> "computers" (with or without capitals, i.e. in any case). The problem is that
> you would need to use a SpanQuery to find terms near each other; but if the
> case-sensitivity required is different for each term, then they will be in
> different fields, making the use of SpanQuery impossible.
> There might be ways to work around this, but my question is: will
> case-insensitivity ever be added to Lucene as a per-Term option? If not, can
> anyone tell me where I should start looking in order to make this change
> myself?
> Thanks!
> -JB
