lucene-solr-user mailing list archives

From Ahmet Arslan <iori...@yahoo.com.INVALID>
Subject Re: Priority in search an synonyms
Date Thu, 11 Dec 2014 11:55:09 GMT
Hi Antoine,

When you say that the synonyms for the ebc_libelle field are not shown, do you mean that you
have a synonym entry for the word Castorama, and that documents containing those synonyms do
not show up in the first 100 documents?

If so, adjust the boost values (5 versus 75) until you get a satisfactorily diverse result
set.
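
To make such experiments repeatable, the boosts can also be kept in solrconfig.xml instead of
in the URL. The following is only a minimal sketch: it assumes a /select handler and reuses
the field names from your message, and the boost values (20 and 10) are arbitrary starting
points that I picked for illustration, not recommendations.

```xml
<!-- Sketch only: handler name and boost values are assumptions to be tuned -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <!-- tmp_libelle has no synonym filter, so it rewards exact matches;
         ebc_libelle is synonym-expanded and gets the lower boost -->
    <str name="qf">tmp_libelle^20 ebc_libelle^10</str>
    <str name="pf">ebc_libelle</str>
  </lst>
</requestHandler>
```

Narrowing the gap between the two boosts lets synonym matches rank closer to exact matches.
Parameters passed on the request still override these defaults, so a qf in the URL wins.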

By the way, I think filling the initial result set with exact matches, whenever possible,
is a good thing.
I believe the user types her query for a reason. If too few documents match exactly,
then other techniques (stemming, synonyms, etc.) should kick in. Please note that this approach
makes sense for search applications where precision is more valuable than recall.

Ahmet





On Thursday, December 11, 2014 12:20 PM, Antoine REBOUL <antoine.reboul@plebicom.com>
wrote:
Hello,

First of all, thank you for your answers!

In my schema.xml file:
- I created this field type:
    <fieldType name="tmp_libelle" class="solr.TextField" positionIncrementGap="100">
        <analyzer type="index"><tokenizer class="solr.StandardTokenizerFactory"/></analyzer>
        <analyzer type="query"><tokenizer class="solr.StandardTokenizerFactory"/></analyzer>
    </fieldType>
- I declared a field of this type, filled by a copyField from ebc_libelle:
    <field name="tmp_libelle" type="tmp_libelle" indexed="true" stored="true" required="false"/>
    <copyField source="ebc_libelle" dest="tmp_libelle"/>

I wonder whether the following statement is required:
<defaultSearchField>ebc_libelle</defaultSearchField>

I test my results with the following settings:
http://IP:8983/solr/select/?qf=tmp_libelle^75%20ebc_libelle^5&pf=ebc_libelle&q=Castorama&start=0&rows=100&indent=on&defType=edismax&sort=score%20asc

The problem I have now is that the synonyms defined for the ebc_libelle field
are not shown in the results.


The field ebc_libelle is analyzed/indexed as follows :
   <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
        <analyzer type="index">
                <tokenizer class="solr.WhitespaceTokenizerFactory"/>
                <filter class="solr.ISOLatin1AccentFilterFactory"/>
                <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
                <filter class="solr.ElisionFilterFactory" articles="elisions.txt"/>
                <filter class="solr.SynonymFilterFactory" synonyms="synonyms2.txt" ignoreCase="true" expand="true"/>
                <filter class="solr.ASCIIFoldingFilterFactory"/>
                <filter class="solr.WordDelimiterFilterFactory"
                        generateWordParts="1"
                        generateNumberParts="1"
                        catenateWords="1"
                        catenateNumbers="1"
                        catenateAll="1"
                        splitOnCaseChange="1"
                        splitOnNumerics="1"
                        preserveOriginal="1"   />
                <filter class="solr.LowerCaseFilterFactory"/>
                <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
        </analyzer>
        <analyzer type="query">
                <filter class="solr.ISOLatin1AccentFilterFactory"/>
                <tokenizer class="solr.WhitespaceTokenizerFactory"/>
                <filter class="solr.WordDelimiterFilterFactory"
                        generateWordParts="1"
                        generateNumberParts="1"
                        catenateWords="1"
                        catenateNumbers="0"
                        catenateAll="1"
                        splitOnCaseChange="1"
                        preserveOriginal="1"/>
                <filter class="solr.StopFilterFactory"
                        ignoreCase="true"
                        words="stopwords.txt"
                        enablePositionIncrements="true"/>
                <filter class="solr.ElisionFilterFactory" articles="elisions.txt"/>
                <filter class="solr.ASCIIFoldingFilterFactory"/>
                <filter class="solr.LowerCaseFilterFactory"/>
                <filter class="solr.SynonymFilterFactory" synonyms="synonyms2.txt" ignoreCase="true" expand="true"/>
                <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>



Best Regards.

*Antoine Reboul*
Responsable Comparateurs / Plateforme emailing
Plebicom -  eBuyClub - Cashstore - Checkdeal

PLEBICOM – 29 avenue Joannes Masset – 69009 Lyon
Tel  : 04 72 85 81 49
Fax : 04 78 83 39 74


2014-12-10 16:40 GMT+01:00 Alexandre Rafalovitch <arafalov@gmail.com>:

> This might be written just for you:
>
> http://opensourceconnections.com/blog/2014/12/08/title-search-when-relevancy-is-only-skin-deep/
>
> Merchant would be same as title = short text
>
> Regards,
>    Alex.
> Personal: http://www.outerthoughts.com/ and @arafalov
> Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
> Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
>
>
> On 10 December 2014 at 10:00, Antoine REBOUL
> <antoine.reboul@plebicom.com> wrote:
> > hello,
> >
> > I have a question, and I do not know whether there is a solution.
> >
> > I index and search a field named "Libel".
> > I use a synonyms file.
> >
> > For example, my synonyms file contains the following line:
> > "ipad => Apple, Priceminister, Amazon"
> >
> > A search for iPad returns Apple, Amazon, and Priceminister (the expected
> > result).
> > But when I search for "Apple", I want the merchant Apple to be returned
> > first.
> > This is not the case; in fact, Amazon gets the first place.
> >
> > Sorry for my poor English, I'm using a translator.
> >
> > Best Regards.
> >
> > *Antoine Reboul*
> > Responsable Comparateurs / Plateforme emailing
> > Plebicom -  eBuyClub - Cashstore - Checkdeal
> >
> > PLEBICOM – 29 avenue Joannes Masset – 69009 Lyon
> > Tel  : 04 72 85 81 49
> > Fax : 04 78 83 39 74
>
