lucene-dev mailing list archives

From "Hoss Man (JIRA)" <>
Subject [jira] [Updated] (SOLR-1365) Add configurable Sweetspot Similarity factory
Date Wed, 27 Feb 2013 00:28:13 GMT


Hoss Man updated SOLR-1365:

    Attachment: SOLR-1365.patch

Ok, here's an all new patch for the post SOLR-2338 world order.

Example syntax...

    <!-- using baseline TF -->
    <fieldType name="text_baseline" class="solr.TextField"
               indexed="true" stored="false">
      <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
      <similarity class="solr.SweetSpotSimilarityFactory">
        <!-- TF -->
        <float name="baselineTfMin">6.0</float>
        <float name="baselineTfBase">1.5</float>
        <!-- plateau norm -->
        <int name="lengthNormMin">3</int>
        <int name="lengthNormMax">5</int>
        <float name="lengthNormSteepness">0.5</float>
      </similarity>
    </fieldType>
    <!-- using hyperbolic TF -->
    <fieldType name="text_hyperbolic" class="solr.TextField"
               indexed="true" stored="false">
      <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
      <similarity class="solr.SweetSpotSimilarityFactory">
        <!-- TF -->
        <float name="hyperbolicTfMin">3.3</float>
        <float name="hyperbolicTfMax">7.7</float>
        <double name="hyperbolicTfBase">2.718281828459045</double> <!-- e -->
        <float name="hyperbolicTfOffset">5.0</float>
        <!-- plateau norm, shallower slope -->
        <int name="lengthNormMin">1</int>
        <int name="lengthNormMax">5</int>
        <float name="lengthNormSteepness">0.2</float>
      </similarity>
    </fieldType>
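
For anyone not familiar with what these settings control, here is a rough Python sketch of the three curves. It is a transcription of the formulas as I understand them from Lucene's SweetSpotSimilarity (baselineTf, hyperbolicTf, computeLengthNorm), not code from the patch, and it assumes the factory passes the configured values straight through. The function and argument names are made up for illustration, and the tanh form of the hyperbolic curve is my own rewrite to avoid overflow.

```python
import math


def baseline_tf(freq, tf_min, tf_base):
    """Flat at tf_base for up to tf_min occurrences, then square-root growth."""
    if freq == 0:
        return 0.0
    if freq <= tf_min:
        return tf_base
    return math.sqrt(freq + tf_base * tf_base - tf_min)


def hyperbolic_tf(freq, tf_min, tf_max, base, offset):
    """S-shaped curve bounded by tf_min and tf_max, centred on offset."""
    if freq == 0:
        return 0.0
    x = freq - offset
    # (b^x - b^-x) / (b^x + b^-x) equals tanh(x * ln b) for b > 0;
    # tanh is used here because it cannot overflow for large x.
    s_curve = math.tanh(x * math.log(base))
    return tf_min + (tf_max - tf_min) / 2.0 * (s_curve + 1.0)


def length_norm(num_terms, ln_min, ln_max, steepness):
    """1.0 on the plateau [ln_min, ln_max], decaying on either side."""
    # Distance is 0 inside the plateau and grows by 2 per term outside it.
    distance = abs(num_terms - ln_min) + abs(num_terms - ln_max) - (ln_max - ln_min)
    return 1.0 / math.sqrt(steepness * distance + 1.0)
```

With the baseline settings above, baseline_tf(10, 6.0, 1.5) comes out to 2.5, and with the first plateau (min 3, max 5, steepness 0.5) an 8-term field gets a norm of 0.5, while any field of 3 to 5 terms gets 1.0.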

(The factory automatically detects whether to use hyperbolic or baseline TF depending on
which settings are used.)
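
To make the detection idea concrete, here is a minimal Python sketch of one way to pick the TF function from the settings that are present. It is not the logic from the patch: choose_tf_mode and the "default" fallback are hypothetical, and rejecting mixed or partial settings is my assumption about what sensible behaviour would be. Only the setting names come from the examples above.

```python
HYPERBOLIC_KEYS = ("hyperbolicTfMin", "hyperbolicTfMax",
                   "hyperbolicTfBase", "hyperbolicTfOffset")
BASELINE_KEYS = ("baselineTfMin", "baselineTfBase")


def choose_tf_mode(settings):
    """Return 'hyperbolic', 'baseline' or 'default' for a dict of init params."""
    hyper = [k for k in HYPERBOLIC_KEYS if k in settings]
    base = [k for k in BASELINE_KEYS if k in settings]

    # Assumption: configuring both families at once is treated as an error.
    if hyper and base:
        raise ValueError("cannot mix hyperbolicTf and baselineTf settings")

    # Assumption: a family must be given completely or not at all.
    if hyper:
        if len(hyper) != len(HYPERBOLIC_KEYS):
            raise ValueError("hyperbolicTf needs Min, Max, Base and Offset")
        return "hyperbolic"
    if base:
        if len(base) != len(BASELINE_KEYS):
            raise ValueError("baselineTf needs Min and Base")
        return "baseline"

    # No TF settings at all (lengthNorm settings do not affect the choice).
    return "default"
```

Fed the settings from the two fieldTypes above, this returns "baseline" for text_baseline and "hyperbolic" for text_hyperbolic.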

Anyone have any concerns?
> Add configurable Sweetspot Similarity factory
> ---------------------------------------------
>                 Key: SOLR-1365
>                 URL:
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: 1.3
>            Reporter: Kevin Osborn
>            Priority: Minor
>             Fix For: 4.2, 5.0
>         Attachments: SOLR-1365.patch, SOLR-1365.patch
> This is some code that I wrote a while back.
> Normally, if you use SweetSpotSimilarity, you are going to make it do something useful
> by extending SweetSpotSimilarity. So, instead, I made a factory class and a configurable
> SweetSpotSimilarity. There are two classes. SweetSpotSimilarityFactory reads the parameters
> from schema.xml. It then creates an instance of VariableSweetSpotSimilarity, which is my custom
> SweetSpotSimilarity class. In addition to the standard functions, it also handles dynamic
> So, in schema.xml, you could have something like this:
> <similarity class="org.apache.solr.schema.SweetSpotSimilarityFactory">
>     <bool name="useHyperbolicTf">true</bool>
>     <float name="hyperbolicTfFactorsMin">1.0</float>
>     <float name="hyperbolicTfFactorsMax">1.5</float>
>     <float name="hyperbolicTfFactorsBase">1.3</float>
>     <float name="hyperbolicTfFactorsXOffset">2.0</float>
>     <int name="lengthNormFactorsMin">1</int>
>     <int name="lengthNormFactorsMax">1</int>
>     <float name="lengthNormFactorsSteepness">0.5</float>
>     <int name="lengthNormFactorsMin_description">2</int>
>     <int name="lengthNormFactorsMax_description">9</int>
>     <float name="lengthNormFactorsSteepness_description">0.2</float>
>     <int name="lengthNormFactorsMin_supplierDescription_*">2</int>
>     <int name="lengthNormFactorsMax_supplierDescription_*">7</int>
>     <float name="lengthNormFactorsSteepness_supplierDescription_*">0.4</float>
> </similarity>
> So, now everything is in a config file instead of having to create your own subclass.
