lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jolinar13 <jolina...@gmail.com>
Subject Re: Very odd behaviour of FrenchAnalyzer with strings in capital letters
Date Mon, 28 May 2007 14:28:06 GMT

It looks like it remove the letter in the end, if it ends with an 'a', 'e' or
'i'.
Femelles => all:femel
Is this expected?
How to use FrenchAnalyzer?
Thanks
Florian


Jolinar13 wrote:
> 
> Some terms I tested :
> vehicle => all:vehicl
> vehiCle => all:vehicle
> Vehicle => all:vehicl
> VeHicle => all:vehicle
> VEHICLE => all:vehicle
> vehicles => all:vehicl
> paris => all:par
> :S
> 
> 
> Jolinar13 wrote:
>> 
>> Thanks to Luke, I realized my terms were not parsed correctly, and this
>> has nothing to do with upper case!
>> It seems to happen when the word ends with "*i". For example "giovanni"
>> is parsed "giovann".
>> Something about this?
>> Florian
>> 
>> 
>> Jolinar13 wrote:
>>> 
>>> Hello Mark!
>>> Thank you a lot for your answer.
>>> You are right for the Luke part. My Luke version was too old. My bad.
>>> But with Luke I still observe the problem I described.
>>> Any idea how to sort this out?
>>> Maybe this has to do with the fact I use Compass?
>>> Thank you
>>> Florian
>>> 
>>>>>>> I got strange
>>>>>>> search results on strings in uppercase. (example : VEHICLE)
>>>>>>> When I search the string (in lower case), I get no result. I
get
>>>>>>> results
>>>>>>> if
>>>>>>> I use "vehicle*" or "vehiclE", or "vehicLe" etc.
>>>>>>>
>>>>>>> What is odd is that it affects only some of the strings, not
all of
>>>>>>> them.
>>> 
>>> 
>>> markrmiller wrote:
>>>> 
>>>> FrenchAnalyzer does lowercase and using it would not in anyway alter 
>>>> Lukes ability to read your index.
>>>> 
>>>> - Mark
>>>> 
>>>> Jolinar13 wrote:
>>>>> Hello Erick,
>>>>> Still no idea about my problem?
>>>>> Anybody here using the FrenchAnalyzer?
>>>>> Thanks,
>>>>> Florian
>>>>>
>>>>>
>>>>> Jolinar13 wrote:
>>>>>   
>>>>>> Hello,
>>>>>> Thank you for your quick answer.
>>>>>> I use Luke to examine the index, but since I switched to
>>>>>> FrenchAnalyzer,
>>>>>> it says 'Not a Lucene index'.
>>>>>> If I open the index files in a text viewer, the strings are in UPPER
>>>>>> case.
>>>>>> I do use the same analyzer to index and search.
>>>>>> So, do I have to specify the FrenchAnalyzer not to be case sensitive?
>>>>>> How
>>>>>> to do that?
>>>>>> Thanks a lot
>>>>>> Florian
>>>>>>
>>>>>>
>>>>>> Erick Erickson wrote:
>>>>>>     
>>>>>>> First have you gotten a copy of Luke to examine your index to
see
>>>>>>> what's actually indexed?
>>>>>>>
>>>>>>> The default behavior is usually to lowercase everything, but
I'm not
>>>>>>> entirely sure if the French analyzer does this. But I suspect
so.
>>>>>>>
>>>>>>> Searches are case sensitive. To get caseless searching, you need
>>>>>>> to put everything in the same case. This is usually done for
you
>>>>>>> with
>>>>>>> any of the standard analyzers, but check specifically.
>>>>>>>
>>>>>>> Are you using the same analyzer at index AND search time?
>>>>>>>
>>>>>>> Best
>>>>>>> Erick
>>>>>>>
>>>>>>> On 5/21/07, Jolinar13 <jolinar13@gmail.com> wrote:
>>>>>>>       
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I tried org.apache.lucene.analysis.fr.FrenchAnalyzer and
I got
>>>>>>>> strange
>>>>>>>> search results on strings in uppercase. (example : VEHICLE)
>>>>>>>> When I search the string (in lower case), I get no result.
I get
>>>>>>>> results
>>>>>>>> if
>>>>>>>> I use "vehicle*" or "vehiclE", or "vehicLe" etc.
>>>>>>>>
>>>>>>>> What is odd is that it affects only some of the strings,
not all of
>>>>>>>> them.
>>>>>>>> Anyone who has ever experienced this problem?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Florian
>>>>>>>> --
>>>>>>>> View this message in context:
>>>>>>>> http://www.nabble.com/Very-odd-behaviour-of-FrenchAnalyzer-with-strings-in-capital-letters-tf3789153.html#a10715673
>>>>>>>> Sent from the Lucene - Java Users mailing list archive at
>>>>>>>> Nabble.com.
>>>>>>>>
>>>>>>>>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>>>>
>>>>>>>>
>>>>>>>>         
>>>>>>>       
>>>>>>     
>>>>>
>>>>>   
>>>> 
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Very-odd-behaviour-of-FrenchAnalyzer-with-strings-in-capital-letters-tf3789153.html#a10837045
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message