lucene-solr-dev mailing list archives

From Manuel Albela Miranda <alb...@3.14financial.com>
Subject Re: Searching with accents
Date Thu, 01 Feb 2007 17:00:03 GMT
Thorsten Scherler wrote:
> On Thu, 2007-02-01 at 16:35 +0100, Manuel Albela Miranda wrote:
>   
>> Thorsten Scherler wrote:
>>     
>>> On Thu, 2007-02-01 at 12:37 +0100, Manuel Albela Miranda wrote:
>>>   
>>>       
>>>> Hello everybody,
>>>>
>>>> Do you know if there is a way to search with and without accents without
>>>> duplicating a field?
>>>>
>>>> I have a large index (60Gb) and don't want to have two fields with the 
>>>> same content, one with accents and the other without them, because 
>>>> this field is the biggest in the index.
>>>>
>>>> Again, hope you can help me.
>>>>     
>>>>         
>>> Try something like this in your schema.xml:
>>> <fieldtype name="stringSimilar" class="solr.TextField" positionIncrementGap="100">
>>>   <analyzer type="index">
>>>     <tokenizer class="solr.LowerCaseTokenizerFactory"/>
>>>     <filter class="solr.ISOLatin1AccentFilterFactory"/>
>>>   </analyzer>
>>>   <analyzer type="query">
>>>     <tokenizer class="solr.LowerCaseTokenizerFactory"/>
>>>     <filter class="solr.ISOLatin1AccentFilterFactory"/>
>>>   </analyzer>
>>> </fieldtype>
>>>
>>> HTH
>>>
>>> salu2
>>>
>>>   
>>>       
>>>> Thank you very much.
>>>>
>>>> Regards.
>>>>
>>>> Manu
>>>>
>>>>     
>>>>         
>> Hi Thorsten,
>>
>> First of all, thank you for your message. I've been working on the 
>> schema.xml file with the lines you sent me. Now I can filter the query, 
>> but the problem is that I have accents in my index, so when I search for 
>> words with accents, Solr only searches for the word without them, and I 
>> need both. I don't know if there is a way to do this.
>>     
>
> Well, it is not nice but you could use fuzzy search.
>
> For example: q=Órden~0.75
>
> That will find more matches. See recent threads around fuzzy search.
>
> The schema change above works well once you update your index, but that
> means indexing everything again, in your case the WHOLE 60Gb.
>
> salu2
>
>   
Yes, I was considering that, but there is a problem. If I remove the 
accents from the index, the results of a search will not have those 
accents either, so the results will not be good enough.

I still have to check the performance of the fuzzy search, but I don't 
think it will work for me.

Thank you again.

Regards.

Manu.
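
For reference, a minimal sketch of the accent-folding idea discussed in this
thread, written in plain Python rather than with Solr's own filter. It assumes
Unicode NFD decomposition is an acceptable stand-in for
ISOLatin1AccentFilterFactory (the real filter also maps characters such as
ligatures, which this sketch does not), and the names fold_accents,
build_index and search are hypothetical helpers, not part of Solr or Lucene.
The same folding is applied to indexed tokens and to the query, while the
original accented text is kept aside and returned as the result.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Lowercase text and strip combining accent marks."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


def build_index(docs):
    """Map each folded token to the ids of the documents that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.split():
            index.setdefault(fold_accents(token), set()).add(doc_id)
    return index


def search(index, docs, query):
    """Fold the query like the indexed tokens, return the original texts."""
    doc_ids = index.get(fold_accents(query), set())
    return [docs[doc_id] for doc_id in sorted(doc_ids)]


# Hypothetical sample documents, for illustration only.
docs = {1: "Órden de pago", 2: "orden alfabético", 3: "sin relación"}
index = build_index(docs)

print(search(index, docs, "Órden"))
print(search(index, docs, "orden"))
```

In this sketch both queries return the same two documents, and the returned
texts keep their accents because only the lookup keys are folded.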
