lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Solr 5.0 - uniqueKey case insensitive ?
Date Wed, 06 May 2015 00:55:40 GMT
Well, "working fine" may be a bit of an overstatement. That has never
been officially supported, so it "just happened" to work in 3.6.

As Chris points out, if you're using SolrCloud then this will _not_
work: routing happens early in the process, i.e. before the analysis
chain ever sees the value, so copies of the same doc that differ only
in case can end up on different shards.
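
To make the routing point concrete, here is a minimal sketch, not Solr's actual code. It assumes a toy router that hashes the raw uniqueKey with String.hashCode() (real SolrCloud routing uses a different hash function), and the names RoutingSketch, shardFor and normalize are hypothetical. It only illustrates the principle: the raw key decides the shard, so the key has to be lowercased before routing, which is the job an UpdateProcessor would do.

```java
import java.util.Locale;

// Illustrative only: a toy hash router, NOT Solr's implementation.
class RoutingSketch {

    // Hypothetical stand-in for the router: picks a shard from the raw key
    // exactly as the client sent it, before any analysis chain runs.
    static int shardFor(String rawId, int numShards) {
        return Math.floorMod(rawId.hashCode(), numShards);
    }

    // What a lowercasing update processor would do to the key
    // before the document is routed.
    static String normalize(String id) {
        return id.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        int numShards = 3;
        String upper = "EP1";
        String lower = "ep1";

        // Routed on the raw value: the two spellings are unrelated keys
        // and may be sent to different shards.
        System.out.println("raw:        " + shardFor(upper, numShards)
                + " vs " + shardFor(lower, numShards));

        // Routed after normalization: both spellings are the same key,
        // so they always reach the same shard.
        System.out.println("normalized: " + shardFor(normalize(upper), numShards)
                + " vs " + shardFor(normalize(lower), numShards));
    }
}
```

With the raw keys nothing ties "EP1" and "ep1" together, so each can be indexed as a separate doc on its own shard. After normalization they are literally the same key, so the second add overwrites the first.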

Best,
Erick

On Mon, May 4, 2015 at 4:19 PM, Bruno Mannina <bmannina@free.fr> wrote:
> Hello Chris,
>
> yes, I confirm that on my Solr 3.6 it has worked fine for several years, and
> each doc added with the same code is updated, not added.
>
> To be clearer, I receive docs with a field named "pn", which is the
> uniqueKey, and it is always in uppercase.
>
> so I must define in my schema.xml
>
>     <field name="id" type="string" multiValued="false" indexed="true"
> required="true" stored="true"/>
>     <field name="pn" type="text_general" multiValued="true" indexed="true"
> stored="false"/>
> ...
>    <uniqueKey>id</uniqueKey>
> ...
>   <copyField source="id" dest="pn"/>
>
> but the application that uses Solr already exists, so it queries the pn
> field, not id; I cannot change that.
> And the docs I receive have no id field, just a pn field, and I
> cannot change that either.
>
> So there is a problem, no? I must import an id field and query a pn field,
> but I only have a pn field at import time...
>
>
>
> On 05/05/2015 01:00, Chris Hostetter wrote:
>>
>> : On SOLR3.6, I defined a string_ci field like this:
>> :
>> : <fieldType name="string_ci" class="solr.TextField"
>> : sortMissingLast="true" omitNorms="true">
>> :     <analyzer>
>> :       <tokenizer class="solr.KeywordTokenizerFactory"/>
>> :       <filter class="solr.LowerCaseFilterFactory"/>
>> :     </analyzer>
>> :     </fieldType>
>> :
>> : <field name="pn" type="string_ci" multiValued="false" indexed="true"
>> : required="true" stored="true"/>
>>
>>
>> I'm really surprised that field would have worked for you (reliably) as a
>> uniqueKey field even in Solr 3.6.
>>
>> the best practice for something like what you describe has always (going
>> back to Solr 1.x) been to use a copyField to create a case insensitive
>> copy of your uniqueKey for searching.
>>
>> if, for some reason, you really want case insensitive *updates* (so a doc
>> with id "foo" overwrites a doc with id "FOO") then the only reliable way to
>> make something like that work is to do the lowercasing in an
>> UpdateProcessor to ensure it happens *before* the docs are distributed to
>> the correct shard, and so the correct existing doc is overwritten (even if
>> you aren't using solr cloud)
>>
>>
>>
>> -Hoss
>> http://www.lucidworks.com/
>>
>>
>
>
>
