lucene-dev mailing list archives

From Mark Miller
Subject Re: svn commit: r921480 [1/8] - in /lucene/java/trunk: ./ contrib/analyzers/common/src/test/org/apache/lucene/analysis/query/ contrib/analyzers/common/src/test/org/apache/lucene/analysis/shingle/ contrib/ant/src/java/org/apache/lucene/ant/ contrib/be
Date Wed, 10 Mar 2010 18:52:19 GMT
On 03/10/2010 01:48 PM, Robert Muir wrote:
> On Wed, Mar 10, 2010 at 1:40 PM, Shai Erera wrote:
>> I wrote that I defaulted to Whitespace for convenience reasons only. Now you
>> don't need to specify anything if you don't care how the content is indexed,
>> which is really the case for TONS of tests. The code became so much simpler.
> I guess I don't see it this way. It may be convenient for us, but it's
> inconvenient for new users, as they see it as 'Lucene's default'. No one
> wants to do more work than is necessary: currently a lot of people use
> StandardAnalyzer for this reason, maybe without a lot of thought. But this
> is OK. StandardAnalyzer at least does things like lowercasing.
>> Those who do care pay attention to it anyway :).
> I see it as the inverse: I would rather our tests have "new WhitespaceAnalyzer"
> than see users complain on the java-user mailing list that Lucene doesn't
> ignore case differences or punctuation, because they didn't need to think
> about this. Whitespace is a shitty default for a search engine; it's only
> good for tests.

+1. I don't think we should default an Analyzer. I agree that
WhitespaceAnalyzer is not a good default, and I don't think
StandardAnalyzer is a good default either. I agree that you should have
to specify one, to force thinking about it.

- Mark
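
The concern about case and punctuation raised in the quoted text can be
illustrated with a minimal sketch in plain Java. It does not use Lucene's
analyzer classes: the class AnalyzerChoiceSketch and both helper methods are
hypothetical stand-ins, assuming only that whitespace-only analysis keeps
tokens verbatim while a more forgiving analysis lowercases them and drops
punctuation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Plain-Java illustration only; these helpers are hypothetical and are not Lucene classes.
class AnalyzerChoiceSketch {

    // Splits on runs of whitespace and leaves case and punctuation untouched,
    // which is roughly what whitespace-only tokenization implies.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Splits on anything that is not a letter or digit and lowercases each token:
    // a crude stand-in for the lowercasing and punctuation handling discussed above.
    static List<String> normalizedTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String doc = "Hello, World!";
        String query = "hello";
        System.out.println(whitespaceTokens(doc).contains(query)); // prints "false"
        System.out.println(normalizedTokens(doc).contains(query)); // prints "true"
    }
}
```

With whitespace-only tokens the query "hello" does not match the indexed
token "Hello,"; once tokens are lowercased and stripped of punctuation it
does. This is the kind of surprise a user would hit if they never had to
choose an analyzer.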
