lucene-dev mailing list archives

From "Hoss Man (JIRA)" <j...@apache.org>
Subject [jira] Commented: (SOLR-2338) improved per-field similarity integration into schema.xml
Date Wed, 09 Feb 2011 21:33:57 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992728#comment-12992728 ]

Hoss Man commented on SOLR-2338:
--------------------------------

In most existing situations, plugins are dereferenced by name so that we can reuse the exact
same object instance (e.g. for recording stats, or because they are heavyweight to construct
on the fly).

In the case of similarity, the main advantage I can think of would be if we wanted true per-field
similarity declaration, not just per field type, i.e. ...

{code}
<similarity name="S_XX" class=...></similarity>
<similarity name="S_YY" class=...></similarity>
...
<fieldType name="FT_AA"> 
  <analyzer>...</analyzer>
  <similarity name="S_XX"/>
</fieldType>
...
<field name="F_111" type="FT_AA" /><!-- implied S_XX -->
<field name="F_222" type="FT_AA" similarity="S_YY" />
{code}
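The lookup implied by that example could be sketched as follows. This is only an illustration of the fallback rule (a similarity named on the field wins, otherwise the one named on its fieldType applies); `SimilarityNameResolver` and its methods are hypothetical names, not existing Solr classes, and the sketch works on plain strings rather than real Similarity objects.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Solr code: decides which named similarity a field uses.
final class SimilarityNameResolver {
    // fieldType name -> similarity name declared inside that <fieldType>
    private final Map<String, String> typeSimilarity = new HashMap<>();
    // field name -> fieldType name
    private final Map<String, String> fieldToType = new HashMap<>();
    // field name -> similarity name given directly on the <field>, if any
    private final Map<String, String> fieldOverride = new HashMap<>();

    void addFieldType(String typeName, String similarityName) {
        typeSimilarity.put(typeName, similarityName);
    }

    // similarityOverride may be null when the <field> has no similarity attribute
    void addField(String fieldName, String typeName, String similarityOverride) {
        fieldToType.put(fieldName, typeName);
        if (similarityOverride != null) {
            fieldOverride.put(fieldName, similarityOverride);
        }
    }

    // Returns the similarity name for the field, or null if none is declared anywhere.
    String resolve(String fieldName) {
        String override = fieldOverride.get(fieldName);
        if (override != null) {
            return override; // explicit per-field declaration wins
        }
        String typeName = fieldToType.get(fieldName);
        return typeName == null ? null : typeSimilarity.get(typeName);
    }
}
```

With the declarations from the example, `resolve("F_111")` would fall back to the similarity of FT_AA, while `resolve("F_222")` would return the name given on the field itself.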

...but even if we don't do that, I suppose it's also conceivable that someone might have their
own Similarity implementation that is expensive to instantiate (e.g. one that maintains some big
in-memory data structures) and might want to be able to declare one instance and then refer
to it by name in many different fieldType declarations.
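A minimal sketch of that "declare once, refer by name" idea, assuming a simple registry that builds each named instance lazily and hands the same object to every caller. `NamedInstanceRegistry`, `declare` and `lookup` are hypothetical names for illustration only, and the sketch is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch, not Solr code: one shared instance per declared name.
final class NamedInstanceRegistry<T> {
    private final Map<String, Supplier<? extends T>> factories = new HashMap<>();
    private final Map<String, T> instances = new HashMap<>();

    // Registers how to build the instance; nothing is constructed yet.
    void declare(String name, Supplier<? extends T> factory) {
        factories.put(name, factory);
    }

    // Builds the instance on first use, then returns the same object every time.
    T lookup(String name) {
        T existing = instances.get(name);
        if (existing != null) {
            return existing;
        }
        Supplier<? extends T> factory = factories.get(name);
        if (factory == null) {
            throw new IllegalArgumentException("nothing declared with name: " + name);
        }
        T created = factory.get();
        instances.put(name, created);
        return created;
    }
}
```

Every fieldType that refers to the same name would then receive the identical object, so an expensive constructor runs only once.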

I think for now just supporting the first example Yonik cited...

{code}
<fieldType>
  <analyzer>...</analyzer>
  <similarity class=...></similarity>
</fieldType>
{code}

would be a huge win, and we can always enhance it to add name dereferencing later.

> improved per-field similarity integration into schema.xml
> ---------------------------------------------------------
>
>                 Key: SOLR-2338
>                 URL: https://issues.apache.org/jira/browse/SOLR-2338
>             Project: Solr
>          Issue Type: Improvement
>          Components: Schema and Analysis
>    Affects Versions: 4.0
>            Reporter: Robert Muir
>
> Currently since LUCENE-2236, we can enable Similarity per-field, but in schema.xml there
> is only a 'global' factory for the SimilarityProvider.
> In my opinion this is too low-level because to customize Similarity on a per-field basis,
> you have to set your own CustomSimilarityProvider with <similarity class=.../> and manage
> the per-field mapping yourself in java code.
> Instead I think it would be better if you just specify the Similarity in the FieldType,
> like after <analyzer>.
> As far as the example, one idea from LUCENE-1360 was to make a "short_text" or "metadata_text"
> used by the various metadata fields in the example that has better norm quantization for
> its shortness...

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
