lucene-dev mailing list archives

From Robert Muir
Subject Re: svn commit: r921480 [1/8] - in /lucene/java/trunk: ./ contrib/analyzers/common/src/test/org/apache/lucene/analysis/query/ contrib/analyzers/common/src/test/org/apache/lucene/analysis/shingle/ contrib/ant/src/java/org/apache/lucene/ant/ contrib/be
Date Wed, 10 Mar 2010 18:48:59 GMT
On Wed, Mar 10, 2010 at 1:40 PM, Shai Erera wrote:
> I wrote that I defaulted to Whitespace for convenience reasons only. Now you
> don't need to specify anything if you don't care how the content is indexed,
> which is really the case for TONS of tests. The code became so much simpler.

I guess I don't see it this way. It may be convenient for us, but it's a
problem for new users, as they see it as 'lucene's default'. No one wants to do
more work than is necessary: currently a lot of people use StandardAnalyzer for
this reason, maybe without a lot of thought. But this is ok: StandardAnalyzer
at least does things like lowercasing.

> For those who do care, they anyway pay attention to it :).

I see it as the inverse: I would rather our tests have "new WhitespaceAnalyzer"
than see users complain on the java-user mailing list that lucene doesn't
ignore case differences or punctuation, because a default meant they didn't
need to think about this.
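
To make the case and punctuation complaint concrete, here is a minimal
plain-Java sketch. It does not use Lucene and is not how WhitespaceAnalyzer or
StandardAnalyzer are implemented; the class and method names are hypothetical.
It only contrasts splitting on whitespace with a tokenizer that also drops
punctuation and lowercases.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: these helpers mimic the *effect* being
// discussed, not Lucene's actual analyzer implementations.
class AnalyzerDefaultsSketch {

    // Splits on whitespace only, so case and attached punctuation survive.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Splits on any run of non-letter, non-digit characters and lowercases,
    // roughly the behaviour a new user expects from a search engine.
    static List<String> normalizedTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String doc = "Hello, World!";
        System.out.println(whitespaceTokens(doc));  // [Hello,, World!]
        System.out.println(normalizedTokens(doc));  // [hello, world]

        // A query term "hello" only matches the normalized tokens.
        System.out.println(whitespaceTokens(doc).contains("hello"));  // false
        System.out.println(normalizedTokens(doc).contains("hello"));  // true
    }
}
```

With whitespace-only splitting, a search for "hello" misses a document
containing "Hello," which is the kind of surprise that ends up on java-user.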

Whitespace is a shitty default for a search engine; it's only good for tests.

Robert Muir
