lucene-solr-user mailing list archives

From Steve Rowe <sar...@gmail.com>
Subject Re: accent insensitive field-type
Date Thu, 02 Jul 2015 17:42:07 GMT
See https://issues.apache.org/jira/browse/SOLR-7749

> On Jul 2, 2015, at 8:31 AM, Steve Rowe <sarowe@gmail.com> wrote:
> 
> Hi Søren,
> 
> “charFilter” should be “charFilters”, and “filter” should be “filters”; and both their values should be arrays - try this:
> 
> {
>  "add-field-type": {
>    "name":"myTxtField",
>    "class":"solr.TextField",
>    "positionIncrementGap":"100",
>    "analyzer": {
>      "charFilters": [ {"class":"solr.MappingCharFilterFactory", "mapping":"mapping-ISOLatin1Accent.txt"} ],
>      "tokenizer": {"class":"solr.StandardTokenizerFactory"},
>      "filters": [ {"class":"solr.LowerCaseFilterFactory"} ]
>    }
>  }
> }
> 
> There should be better error messages for misspellings here.  I’ll file a JIRA issue.
> 
> (I also moved “filters” after “tokenizer” since that’s the order in which they’re executed in an analysis pipeline, but Solr will interpret the out-of-order version correctly.)
> 
> FYI, if you want to *correct* a field type, rather than create a new one, you should use the “replace-field-type” command instead of the “add-field-type” command.  You’ll get an error if you attempt to add a field type that already exists in the schema.
> 
> Steve
> 
>> On Jul 2, 2015, at 1:17 AM, Søren <sd@syntonetic.com> wrote:
>> 
>> Hi Solr users
>> 
>> I'm new to Solr and I need to be able to search in structured data in a case and accent insensitive manner. E.g. find "Crème brûlée", both when querying with "Crème brûlée" and "creme brulee".
>> 
>> It seems that none of the built-in text types support this, or am I wrong?
>> So I try to add my own inspired by another post, although it was old.
>> 
>> I'm running solr-5.2.1.
>> 
>> Curl to http://localhost:8983/solr/mycore/schema
>> {
>> "add-field-type":{
>>    "name":"myTxtField",
>>    "class":"solr.TextField",
>>    "positionIncrementGap":"100",
>>    "analyzer":{
>>       "charFilter": {"class":"solr.MappingCharFilterFactory", "mapping":"mapping-ISOLatin1Accent.txt"},
>>       "filter": {"class":"solr.LowerCaseFilterFactory"},
>>       "tokenizer": {"class":"solr.StandardTokenizerFactory"}
>>       }
>>   }
>> }
>> 
>> But it doesn't work and when I look in '[... ]\solr-5.2.1\server\solr\mycore\conf\managed-schema'
>> the analyzer section is reduced to this:
>> <fieldType name="myTxtField" class="solr.TextField" positionIncrementGap="100">
>>   <analyzer>
>>     <tokenizer class="solr.StandardTokenizerFactory"/>
>>   </analyzer>
>> </fieldType>
>> 
>> Am I almost there or am I on a completely wrong track?
>> 
>> Thanks in advance
>> Søren
>> 
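
For anyone scripting the Schema API call rather than using curl, here is a minimal Python sketch of the request body discussed in this thread. It assumes "charFilters" and "filters" are arrays and "tokenizer" is a single object, and it reuses the field type name and the http://localhost:8983/solr/mycore/schema URL from Søren's message. The function names build_field_type_command and post_schema_command are hypothetical helpers made up for this illustration, not part of Solr or any client library.

```python
import json
import urllib.request


def build_field_type_command(command="add-field-type", name="myTxtField"):
    """Build a Schema API body for a case- and accent-insensitive text type.

    Hypothetical helper: `command` is either "add-field-type" (new type) or
    "replace-field-type" (correct a type that already exists in the schema).
    """
    if command not in ("add-field-type", "replace-field-type"):
        raise ValueError("unsupported command: %s" % command)
    return {
        command: {
            "name": name,
            "class": "solr.TextField",
            "positionIncrementGap": "100",
            "analyzer": {
                # Plural key, array value: runs before tokenization.
                "charFilters": [
                    {
                        "class": "solr.MappingCharFilterFactory",
                        "mapping": "mapping-ISOLatin1Accent.txt",
                    }
                ],
                # Assumed to be a single object: an analyzer has one tokenizer.
                "tokenizer": {"class": "solr.StandardTokenizerFactory"},
                # Plural key, array value: runs after tokenization.
                "filters": [{"class": "solr.LowerCaseFilterFactory"}],
            },
        }
    }


def post_schema_command(schema_url, body):
    """POST the command as JSON and return the raw response text."""
    request = urllib.request.Request(
        schema_url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

To correct the field type that was already created with the misspelled keys, something like post_schema_command("http://localhost:8983/solr/mycore/schema", build_field_type_command("replace-field-type")) should work, since adding a type that already exists is rejected.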
> 
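
The execution order mentioned above (char filters, then the tokenizer, then token filters) can be approximated outside Solr to check what the field type is expected to produce. This is only a rough sketch under stated assumptions: Unicode NFD decomposition stands in for mapping-ISOLatin1Accent.txt, and a simple word regex stands in for StandardTokenizerFactory, so results may differ from Solr's for characters that have no decomposition or for unusual punctuation. All function names here are hypothetical.

```python
import re
import unicodedata


def fold_accents(text):
    """Char-filter stage: strip combining accents from the raw text.

    Stand-in for MappingCharFilterFactory; uses NFD decomposition instead
    of Solr's mapping file.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def tokenize(text):
    """Tokenizer stage: split on non-word characters (a crude approximation)."""
    return re.findall(r"\w+", text)


def lowercase(tokens):
    """Token-filter stage: lowercase each token."""
    return [token.lower() for token in tokens]


def analyze(text):
    """Run the three stages in pipeline order."""
    return lowercase(tokenize(fold_accents(text)))


print(analyze("Crème brûlée"))  # ['creme', 'brulee']
print(analyze("creme brulee"))  # ['creme', 'brulee']
```

Both the indexed text and the query reduce to the same tokens, which is why "creme brulee" would match "Crème brûlée" once the field type is defined correctly.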

