lucene-solr-user mailing list archives

From Solr User <solr...@gmail.com>
Subject Re: Dismax - Boosting
Date Thu, 18 Nov 2010 18:05:59 GMT
Ahmet,

I modified the schema as follows (added more fields for faceting):


<field name="title" type="text" indexed="true" stored="true"
omitNorms="true" />

<field name="author" type="text" indexed="true" stored="true"
multiValued="true" omitNorms="true" />

<field name="authortype" type="text" indexed="true" stored="true"
multiValued="true" omitNorms="true" />

<field name="isbn13" type="text" indexed="true" stored="true" />

<field name="isbn10" type="text" indexed="true" stored="true" />

<field name="material" type="text" indexed="true" stored="true" />

<field name="pubdate" type="text" indexed="true" stored="true" />

<field name="pubyear" type="text" indexed="true" stored="true" />

<field name="reldate" type="text" indexed="false" stored="true" />

<field name="format" type="text" indexed="true" stored="true" />

<field name="pages" type="text" indexed="false" stored="true" />

<field name="desc" type="text" indexed="true" stored="true" />

<field name="series" type="text" indexed="true" stored="true" />

<field name="season" type="text" indexed="true" stored="true" />

<field name="imprint" type="text" indexed="true" stored="true" />

<field name="bisacsub" type="text" indexed="true" stored="true"
multiValued="true" omitNorms="true" />

<field name="bisacstatus" type="text" indexed="false" stored="true" />

<field name="category" type="text" indexed="true" stored="true"
multiValued="true" omitNorms="true" />

<field name="award" type="text" indexed="true" stored="true"
multiValued="true" omitNorms="true" />

<field name="age" type="text" indexed="true" stored="true" />

<field name="reading" type="text" indexed="true" stored="true" />

<field name="grade" type="text" indexed="true" stored="true" />

<field name="path" type="text" indexed="false" stored="true" />

<field name="shortdesc" type="text" indexed="true" stored="true" />

<field name="subtitle" type="text" indexed="true" stored="true"
omitNorms="true"/>

<field name="price" type="float" indexed="true" stored="true"/>

<field name="author_facet" type="string" indexed="true" stored="true"
omitNorms="true"/>

<field name="pubyear_facet" type="string" indexed="true" stored="true"
multiValued="true" omitNorms="true"/>

<field name="format_facet" type="string" indexed="true" stored="true"
omitNorms="true"/>

<field name="series_facet" type="string" indexed="true" stored="true"
omitNorms="true"/>

<field name="season_facet" type="string" indexed="true" stored="true"
omitNorms="true"/>

<field name="imprint_facet" type="string" indexed="true" stored="true"
omitNorms="true"/>

<field name="category_facet" type="string" indexed="true" stored="true"
multiValued="true" omitNorms="true"/>

<field name="award_facet" type="string" indexed="true" stored="true"
multiValued="true" omitNorms="true"/>

<field name="age_facet" type="string" indexed="true" stored="true"
omitNorms="true"/>

<field name="reading_facet" type="string" indexed="true" stored="true"
omitNorms="true"/>

<field name="grade_facet" type="string" indexed="true" stored="true"
omitNorms="true"/>

<field name="price_facet" type="string" indexed="true" stored="true"
omitNorms="true"/>

Also added Copy Fields as below:


<copyField source="author" dest="author_facet"/>

<copyField source="pubyear" dest="pubyear_facet"/>

<copyField source="format" dest="format_facet"/>

<copyField source="series" dest="series_facet"/>

<copyField source="season" dest="season_facet"/>

<copyField source="imprint" dest="imprint_facet"/>

<copyField source="category" dest="category_facet"/>

<copyField source="award" dest="award_facet"/>

<copyField source="age" dest="age_facet"/>

<copyField source="reading" dest="reading_facet"/>

<copyField source="grade" dest="grade_facet"/>

<copyField source="price" dest="price_facet"/>
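
For reference, a request against these new fields would presumably name the `_facet` fields in `facet.field` rather than the original text fields. The following is only a minimal sketch of building such a URL. It assumes a hypothetical host and port, the `tradecore` core path from the earlier query, and that the documents were reindexed after the schema change so the copyField destinations are populated. The helper name `build_facet_url` is made up for illustration.

```python
from urllib.parse import urlencode

# Hypothetical base URL: host and port are assumptions; the core path is from the thread.
BASE_URL = "http://localhost:8983/solr/tradecore/select/"

# Source fields from the copyField list; each has a "<name>_facet" string copy.
FACET_SOURCES = ["author", "pubyear", "format", "series", "season", "imprint",
                 "category", "award", "age", "reading", "grade", "price"]


def build_facet_url(query, rows=9):
    """Build a select URL that facets on the untokenized *_facet copies."""
    params = [("q", query), ("wt", "json"), ("rows", str(rows)),
              ("facet", "true"), ("facet.limit", "-1"), ("facet.mincount", "1")]
    # Repeat facet.field once per field, pointing at the string copy.
    params += [("facet.field", name + "_facet") for name in FACET_SOURCES]
    return BASE_URL + "?" + urlencode(params)


url = build_facet_url("curious")
print(url)
```
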
With the above changes, I am not getting any facet data in the results.

Why is the facet data not being returned, and what mistake did I make in the
schema?

Thanks,
Solr User

On Wed, Nov 17, 2010 at 6:42 PM, Ahmet Arslan <iorixxx@yahoo.com> wrote:

>
>
> Wow, you facet on many fields:
>
> author,pubyear,format,series,season,imprint,category,award,age,reading,grade,price
>
> The fields you facet on should be an untokenized type: string, int, tint, date,
> etc.
>
> The fields you want full-text search on, e.g. the ones you specify in the qf and pf
> parameters, should be of type text.
> (title subtitle author desc shortdesc imprint category isbn13 isbn10 format
> series season bisacsub award)
>
> If a field is used for both, for example category, you need two copies of it:
> one string and one text, so that you can both full-text search and facet on it.
> Use copyField for this.
>
> <copyField source="category" dest="category_string"/>
>
> Example document:
> category: electronic devices
>
>
> The query electronic will return it, and facets on category_string will be
> displayed as:
>
> electronic devices (1)
>
> not:
>
> electronic (1)
> devices (1)
>
>
>
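
The difference described above between faceting on a tokenized text field and on an untokenized string field can be sketched in a few lines. This is a toy model, not Solr's actual analysis chain: lowercasing and whitespace splitting stand in for the text analyzer, and the function name `facet_counts` is hypothetical.

```python
from collections import Counter


def facet_counts(values, tokenized):
    """Count facet entries for one field value per document.

    tokenized=True mimics a text field (one facet entry per term);
    tokenized=False mimics a string field (one entry per whole value).
    """
    counts = Counter()
    for value in values:
        if tokenized:
            # Crude stand-in for an analyzer: lowercase, split on whitespace,
            # and count each distinct term once per document.
            counts.update(set(value.lower().split()))
        else:
            counts[value] += 1
    return counts


docs = ["electronic devices"]
text_facets = facet_counts(docs, tokenized=True)
string_facets = facet_counts(docs, tokenized=False)
print(dict(text_facets))
print(dict(string_facets))
```

With the single example document, the tokenized variant yields separate entries for each word, while the string variant yields one entry for the whole value, which is the behavior the copyField to a string field is meant to achieve.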
> --- On Wed, 11/17/10, Solr User <solrnew@gmail.com> wrote:
>
> > From: Solr User <solrnew@gmail.com>
> > Subject: Re: Dismax - Boosting
> > To: solr-user@lucene.apache.org
> > Date: Wednesday, November 17, 2010, 11:31 PM
> > Ahmet,
> >
> > Thanks for the reply; it was very helpful.
> >
> > The query that I used before changing to dismax was:
> >
> >
> /solr/tradecore/spell/?q=curious&wt=json&rows=9&facet=true&facet.limit=-1&facet.mincount=1&facet.field=author&facet.field=pubyear&facet.field=format&facet.field=series&facet.field=season&facet.field=imprint&facet.field=category&facet.field=award&facet.field=age&facet.field=reading&facet.field=grade&facet.field=price&spellcheck=true
> >
> > The above query used to return all the facet data, the results, and
> > any spelling suggestions correctly.
> >
> > The configuration after switching to dismax is as
> > below:
> >
> > Schema.xml:
> >
> >    <field name="title" type="text" indexed="true" stored="true" omitNorms="true" />
> >    <field name="author" type="text" indexed="true" stored="true" multiValued="true" omitNorms="true" />
> >    <field name="authortype" type="text" indexed="true" stored="true" multiValued="true" omitNorms="true" />
> >    <field name="isbn13" type="text" indexed="true" stored="true" />
> >    <field name="isbn10" type="text" indexed="true" stored="true" />
> >    <field name="material" type="text" indexed="true" stored="true" />
> >    <field name="pubdate" type="text" indexed="true" stored="true" />
> >    <field name="pubyear" type="text" indexed="true" stored="true" />
> >    <field name="reldate" type="text" indexed="false" stored="true" />
> >    <field name="format" type="text" indexed="true" stored="true" />
> >    <field name="pages" type="text" indexed="false" stored="true" />
> >    <field name="desc" type="text" indexed="true" stored="true" />
> >    <field name="series" type="text" indexed="true" stored="true" />
> >    <field name="season" type="text" indexed="true" stored="true" />
> >    <field name="imprint" type="text" indexed="true" stored="true" />
> >    <field name="bisacsub" type="text" indexed="true" stored="true" multiValued="true" omitNorms="true" />
> >    <field name="bisacstatus" type="text" indexed="false" stored="true" />
> >    <field name="category" type="text" indexed="true" stored="true" multiValued="true" omitNorms="true" />
> >    <field name="award" type="text" indexed="true" stored="true" multiValued="true" omitNorms="true" />
> >    <field name="age" type="text" indexed="true" stored="true" />
> >    <field name="reading" type="text" indexed="true" stored="true" />
> >    <field name="grade" type="text" indexed="true" stored="true" />
> >    <field name="path" type="text" indexed="false" stored="true" />
> >    <field name="shortdesc" type="text" indexed="true" stored="true" />
> >    <field name="subtitle" type="text" indexed="true" stored="true" omitNorms="true"/>
> >    <field name="price" type="float" indexed="true" stored="true"/>
> >
> > SolrConfig.xml:
> >
> >   <requestHandler name="dismax" class="solr.SearchHandler" default="true">
> >     <lst name="defaults">
> >      <str name="defType">dismax</str>
> >      <str name="echoParams">explicit</str>
> >      <!-- <float name="tie">0.01</float> -->
> >      <str name="qf">
> >         title^9.0 subtitle^3.0 author^1.0 desc shortdesc imprint category
> >         isbn13 isbn10 format series season bisacsub award
> >      </str>
> >      <!--
> >      <str name="pf">
> >         text^0.2 features^1.1 name^1.5 manu^1.4 manu_exact^1.9
> >      </str>
> >      <str name="bf">
> >         popularity^0.5 recip(price,1,1000,1000)^0.3
> >      </str>
> >      -->
> >      <str name="fl">
> >         *
> >      </str>
> >      <!--
> >      <str name="mm">
> >         2<-1 5<-2 6<90%
> >      </str>
> >      <int name="ps">100</int>
> >      <str name="q.alt">*:*</str>
> >      -->
> >      <!-- example highlighter config, enable per-query with hl=true -->
> >      <!--
> >      <str name="hl.fl">text features name</str>
> >      -->
> >      <!-- for this field, we want no fragmenting, just highlighting -->
> >      <!--
> >      <str name="f.name.hl.fragsize">0</str>
> >      -->
> >      <!-- instructs Solr to return the field itself if no query terms are
> >           found -->
> >      <!--
> >      <str name="f.name.hl.alternateField">name</str>
> >      <str name="f.text.hl.fragmenter">regex</str>
> >      -->
> >      <!-- defined below -->
> >     </lst>
> >   </requestHandler>
> >
> > The query that I used after changing to dismax is:
> >
> >
> solr/tradecore/select/?q=curious&wt=json&rows=9&facet=true&facet.limit=-1&facet.mincount=1&facet.field=author&facet.field=pubyear&facet.field=format&facet.field=series&facet.field=season&facet.field=imprint&facet.field=category&facet.field=award&facet.field=age&facet.field=reading&facet.field=grade&facet.field=price&spellcheck=true
> >
> >
> > I am having the following issues after
> > switching to dismax:
> >
> > 1. Facet data is not coming back correctly; a lot of extra data
> > is returned. Why, and
> > how can I fix it?
> > 2. How do I use the spell checker request handler along with
> > dismax?
> >
> > Thanks,
> > Murali
> >
> > On Mon, Nov 15, 2010 at 5:38 PM, Ahmet Arslan <iorixxx@yahoo.com>
> > wrote:
> >
> > > > 1. Do we need to change the above DisMax handler
> > > > configuration as per our
> > > > requirements? Or Leave it as it is? What
> > changes?
> > >
> > > Yes, you need to edit it. At least the field names. Does your schema have a
> > > field named sku?
> > >
> > > > 2. Do we need to make DisMax the default request
> > > > handler?  Do I need to add
> > > > attribute default="true" to the tag?
> > >
> > > If you are going to always use it, why not? Change it by adding
> > > default="true". By doing so you don't need to add the qt parameter in every request.
> > > But don't forget to delete the other default="true". There can be only one
> > > default="true" :)
> > >
> > > > 3. I read in the documentation that the default search handler
> > > > and DisMax are the same, except that to use DisMaxQueryParser you add
> > > > defType=dismax to the query string. Is there anything else we need to
> > > > do?
> > >
> > > The above dismax config contains a default parameter list, so you don't need to
> > > add &defType=dismax&qf=title^1.0 text^1.5 ... etc. to the query string.
> > >
> > >
> > > > We are basically moving on to the dismax handler and trying to
> > > > understand what
> > > > changes we need to make to SolrConfig.xml.
> > >
> > > As you can see in the default solrconfig.xml, you can register multiple
> > > instances of solr.SearchHandler with different default parameter lists and
> > > names. The default="true" one is executed by default.
> > >
> > > And this can be helpful in deciding on dismax params: qf, pf, ps, mm, etc.
> > > http://www.lucidimagination.com/blog/2010/05/23/whats-a-dismax/
> > >
> > >
> > >
> > >
> >
>
>
>
>
