lucene-solr-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: why query chinese character with bracket become phrase query by default?
Date Sun, 15 May 2011 15:58:27 GMT
I opened https://issues.apache.org/jira/browse/SOLR-2519 for this.

Mike

http://blog.mikemccandless.com

On Sun, May 15, 2011 at 8:02 AM, Michael McCandless
<lucene@mikemccandless.com> wrote:
> On Fri, May 6, 2011 at 8:49 AM, Michael McCandless
> <lucene@mikemccandless.com> wrote:
>
>> Shouldn't we have field types in the example schema for the different
>> languages? That is, text_zh, text_th, text_en, text_ja, text_nl, etc.
>
> In fact, until we break out dedicated language field types, shouldn't
> we default autophrase to off in Solr?
>
> I think this is what ElasticSearch does (just inherits Lucene's
> default for this) -- Shay, or any ElasticSearch users out there... can
> you confirm?
>
> Leaving autophrase on is catastrophic for non-whitespace languages
> (CJK and others), and at best iffy for whitespace languages: it is
> unexpected that the QueryParser would make a PhraseQuery when the user
> hadn't asked for one, it is not clear that it really helps relevance,
> and it definitely hurts performance. So leaving it on is doing far
> more harm than good, as far as I can tell.
>
> Any objections to turning off autophrase by default in Solr, until we
> have per-language field types?
>
> Mike
>
> http://blog.mikemccandless.com
>
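To illustrate the autophrase behavior discussed above, here is a toy model in Python. It is not Lucene or Solr code: the functions analyze and parse, the one-token-per-CJK-character tokenization, and the ("PHRASE", ...) / ("TERM", ...) tuples are all assumptions made for illustration. The sketch only shows the general idea that, with autophrase on, a single whitespace-delimited chunk that analyzes into several tokens becomes a phrase query even though the user typed no quotes.

```python
# Toy model of "autophrase"; NOT Lucene/Solr code. All names are hypothetical.

def is_cjk(ch):
    # Rough check covering only the CJK Unified Ideographs block.
    return "\u4e00" <= ch <= "\u9fff"


def analyze(chunk):
    """Tokenize one whitespace-delimited chunk.

    Assumption: each CJK character becomes its own token, runs of other
    alphanumeric characters form one lowercased token, and everything
    else (punctuation) acts as a separator.
    """
    tokens, run = [], ""
    for ch in chunk:
        if is_cjk(ch):
            if run:
                tokens.append(run)
                run = ""
            tokens.append(ch)
        elif ch.isalnum():
            run += ch.lower()
        else:
            if run:
                tokens.append(run)
                run = ""
    if run:
        tokens.append(run)
    return tokens


def parse(query, autophrase):
    """Return a list of (kind, tokens) clauses for the query string."""
    clauses = []
    for chunk in query.split():
        tokens = analyze(chunk)
        if autophrase and len(tokens) > 1:
            # One chunk, several tokens: silently turned into a phrase.
            clauses.append(("PHRASE", tokens))
        else:
            # Otherwise each token is an independent term clause.
            clauses.extend(("TERM", [t]) for t in tokens)
    return clauses


if __name__ == "__main__":
    for flag in (True, False):
        print(flag, parse("中国人", flag))
        print(flag, parse("wi-fi router", flag))
```

In this model a Chinese query with no spaces is always a single chunk, so with autophrase on every such query becomes a phrase query over its characters, which is the failure mode described in the message. With autophrase off the same query yields independent term clauses.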
