lucene-general mailing list archives

From mohan sundaram <mohanmc...@gmail.com>
Subject Re: Arabic words search in solr
Date Tue, 31 Jan 2017 05:58:42 GMT
Hi Steve,

Thanks for sharing the information. I was looking for your email, but you
replied on the solr-user list. I have now subscribed to the solr-user list
so that I receive its email.

I went through the Solr reference document that you linked. That document
covers Solr version 6.4.0.

The implemented Solr version in my project is 4.9.0.


As I mentioned earlier, in my Solr schema.xml I defined the product Arabic
name field as below:

/*----------------------------------------------*/

<field name="productNameArabic" type="text_ar" indexed="true" stored="true"/>

<fieldType name="text_ar" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ar.txt"/>
    <filter class="solr.ArabicNormalizationFilterFactory"/>
    <filter class="solr.ArabicStemFilterFactory"/>
  </analyzer>
</fieldType>

/*----------------------------------------------*/



I am indexing the Arabic content using the “text_ar” field type.




*Characters*

*ا*

*أ*

*إ*

*آ*

Table 1: alef and its variants. The hamza and madda forms are typed using
the Shift key.


These are examples of the characters that cause my search difficulty.




*Example Indexed words*

*ابرا*

*أبرا*

*إبرا*

*آبرا*

Table 2

These are examples of words indexed in Solr.



*Searching word*

*ابرا*

Table 3


My problem is this: when I search for the word in Table 3, the results
should include all of the indexed words in Table 2.
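
To make the expected behavior concrete, here is a minimal Python sketch of
the matching I am hoping for. It is not Solr code: the names fold_alef and
search are made up for illustration, and the mapping only covers the three
alef variants from Table 1. It assumes the same folding is applied to both
the indexed words and the query.

```python
# Illustrative sketch only (not Solr code): fold alef variants to bare alef
# on both the indexed words and the query, then compare the folded forms.
# fold_alef and search are hypothetical helper names.

ALEF = "\u0627"  # bare alef
ALEF_VARIANTS = {
    "\u0622": ALEF,  # alef with madda above
    "\u0623": ALEF,  # alef with hamza above
    "\u0625": ALEF,  # alef with hamza below
}


def fold_alef(text: str) -> str:
    """Replace each alef variant with bare alef; leave other characters alone."""
    return "".join(ALEF_VARIANTS.get(ch, ch) for ch in text)


def search(query: str, indexed_words: list[str]) -> list[str]:
    """Return the indexed words whose folded form equals the folded query."""
    folded_query = fold_alef(query)
    return [word for word in indexed_words if fold_alef(word) == folded_query]


# The four indexed words from Table 2, written with escapes for clarity.
INDEXED = [
    "\u0627\u0628\u0631\u0627",  # starts with bare alef
    "\u0623\u0628\u0631\u0627",  # starts with alef with hamza above
    "\u0625\u0628\u0631\u0627",  # starts with alef with hamza below
    "\u0622\u0628\u0631\u0627",  # starts with alef with madda above
]

# The search word from Table 3 (bare alef form).
QUERY = "\u0627\u0628\u0631\u0627"

matches = search(QUERY, INDEXED)
```

With this folding, matches contains all four words from Table 2, which is
the result I expect from the text_ar field.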



Is Solr version 4.9.0 compatible with Arabic search, or do I need to upgrade
to a higher version?


Kindly let me know if I should give examples for all characters, since I
gave an example for only one character, which is alef with hamza.


Thanks,

Mohan




On Mon, Jan 30, 2017 at 9:21 PM, Steve Rowe <sarowe@gmail.com> wrote:

> Hi Mohan,
>
> I answered your question on the solr-user list.  Did you see my response?
>
> I CC’d you on this email, but you should know that Apache mailing lists
> won’t automatically send you email unless you have subscribed to the list.
> For more information, see
> <http://lucene.apache.org/solr/community.html#mailing-lists-irc>.
>
> --
> Steve
> www.lucidworks.com
>
> > On Jan 29, 2017, at 2:16 PM, mohan sundaram <mohanmca01@gmail.com> wrote:
> >
> > Hi,
> >
> > In solr search I want to search with product name using Arabic letters.
> > While searching, Arabic user can feel little default to search some
> > product name. Because some characters need to mention while searching.
> >
> > Ex: إ أ آ
> >
> >
> > In the above mentioned characters, user can get combination of shift key.
> > Usually if Arabic people will mention “ ا “  character and will get the
> > below combined words.
> >
> > Ex: إبرا
> >
> >
> > In my solr schema.xml I defined product arabic name field as below
> >
> >
> > <field name="productNameArabic" type="text_ar" indexed="true"
> > stored="true"/>
> >
> >
> >  <fieldType name="text_ar" class="solr.TextField"
> > positionIncrementGap="100">
> >
> >      <analyzer>
> >
> >        <tokenizer class="solr.StandardTokenizerFactory"/>
> >
> >        <filter class="solr.LowerCaseFilterFactory"/>
> >
> >        <filter class="solr.StopFilterFactory" ignoreCase="true"
> > words="lang/stopwords_ar.txt" />
> >
> >        <filter class="solr.ArabicNormalizationFilterFactory"/>
> >
> >        <filter class="solr.ArabicStemFilterFactory"/>
> >
> >      </analyzer>
> >
> >    </fieldType>
> >
> >
> >
> > What changes I have do in schame.xml. Please help me on this.
> >
> >
> >
> > --
> > Regards,
> > Mohan.N
> > 096896429683
>
>


-- 
Regards,
Mohan.N
9865998919
