lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From esra <esraer...@gmail.com>
Subject Re: lucene farsi problem
Date Wed, 30 Apr 2008 12:06:53 GMT

Hi,
thanks for your reply.
I am using StandartAnalyzer now and my xml document is like below:

<keyword><![CDATA[ساب ووفر]]></keyword>
      <description><![CDATA[یک ووفر که در محفظه ای جدا از سایر
درایور ها
قرار دارد تا صدایی با باس فوق العاده پایین تولید کند.
]]></description>
    
i googled for farsi analyzer and found nothing also i am not sure it if
would solve my problem or not.

Thanks,

Esra


Grant Ingersoll-6 wrote:
> 
> What Analyzer are you using?  You might try looking in Luke to see  
> what is in your index, etc.  It also isn't clear to me what your  
> documents look like.
> 
> As for a Farsi analyzer, I would Google "Farsi analyzer Lucene" and  
> see if you can find anything.  Otherwise, you will have to write your  
> own (and donate it????)
> 
> -Grant
> 
> On Apr 30, 2008, at 3:21 AM, esra wrote:
> 
>>
>> hi,
>>
>> i am using lucene's "IndexSearcher" to search the given xml by  
>> keyword which
>> contains farsi information.
>> while searching i use ranges like
>>
>> آ-ث  |  ج-خ  |  د-ژ  |  س-ظ  |  ع-ق  |  ک-ل  |  م-ی
>>
>> when i do search for  "د-ژ"  range the results are wrong , they are  
>> the
>> results of  " س-ظ "range.
>>
>> for example when i do search for "د-ژ"  one of the the results is  
>> "ساب ووفر"
>> , this result also shown on the " س-ظ " range's result list which  
>> is the
>> corret range.
>>
>> As IndexSearcher use "compareTo" method and this method uses  
>> unicodes for
>> comparing, i found the unicodes of the characters.
>>
>> د=U+62F
>> ژ = U+698
>> and the first letter of "ساب ووفر " is  س = U+633
>>
>> Do you have any idea how to solve this problem, there are analyzers  
>> for
>> different languages ,
>> will this be usefull if so do you know where to find a farsi analyzer?
>>
>> I would bu glad if you help.
>>
>> thanks ,
>>
>> Esra
>>
>> -- 
>> View this message in context:
>> http://www.nabble.com/lucene-farsi-problem-tp16977096p16977096.html
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
> 
> --------------------------
> Grant Ingersoll
> 
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ
> 
> 
> 
> 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/lucene-farsi-problem-tp16977096p16980977.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message