lucene-dev mailing list archives

From "Hoss Man (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-5084) new field type - EnumField
Date Wed, 28 Aug 2013 18:18:52 GMT

    [ https://issues.apache.org/jira/browse/SOLR-5084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752659#comment-13752659
] 

Hoss Man commented on SOLR-5084:
--------------------------------

bq. And I still would really like it if we didn't need a separate XML file for each enumerated
type: it's like a parallel schema.xml: I think it would be much better if we could nest this
underneath the fieldtype.

it would be nice, but as far as i know there is no way for a FieldType to do this -- making
this FieldType use an attribute to refer to another file (just like ExternalFileField does,
or StopFilterFactory, or SynonymFilterFactory, etc...) seems like a suitable approach
for now, and if/when someone enhances FieldType configuration in general, then it can be revisited.
 (ie: it doesn't seem fair to Elran to object to this patch/feature given that he's working
with the APIs available)

bq. Finally, I still think the ordinals should be implicit in the list (as i mentioned before).
This way the thing can actually be efficient.

I agree that it makes sense to require that the ordinals be "dense" (ie: start at 0, no gaps
allowed).

But from a usability standpoint, I think it's actually better to force the
Solr admin writing the config to be explicit about the numeric mappings in the config so that
they *have* to be aware of the fact that a specific numeric value is used under the covers
(ie: in the indexed/docValues fields) for each value that the end users get.  It seems like
it will help minimize the risk of someone assuming that only the "labels" matter in the configs
and that they can insert new ones to get the sorting they want.

Example:

If the config looked like this...

{noformat}
<enum name="priority">
  <value>LOW</value>
  <value>HIGH</value>
</enum>
{noformat}

...then a user might not realize there is anything wrong with making the following additions
w/o re-indexing...

{noformat}
<enum name="priority">
  <value>NONE</value>
  <value>LOW</value>
  <value>MEDIUM</value>
  <value>HIGH</value>
</enum>
{noformat}

...and if they did that they would silently get bogus results -- no obvious error at runtime.
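
To make the failure mode concrete, here is a minimal sketch in plain Java (not Solr code). It assumes, purely for illustration, that the stored int for a label is its position in the configured list; the class and method names are hypothetical and do not come from the patch.

```java
import java.util.Arrays;
import java.util.List;

public class ImplicitOrdinalDemo {
    // Assumption: with implicit ordinals, a label's stored int is its list position.
    public static int ordinalOf(List<String> labels, String label) {
        return labels.indexOf(label);
    }

    // Decoding a stored int uses whatever list is configured at read time.
    public static String labelOf(List<String> labels, int ordinal) {
        return labels.get(ordinal);
    }

    public static void main(String[] args) {
        List<String> oldConfig = Arrays.asList("LOW", "HIGH");
        List<String> newConfig = Arrays.asList("NONE", "LOW", "MEDIUM", "HIGH");

        // A document indexed as HIGH under the old config stores ordinal 1.
        int stored = ordinalOf(oldConfig, "HIGH");

        // After the config edit (and no re-index), ordinal 1 decodes to a different label.
        System.out.println(labelOf(newConfig, stored)); // prints "LOW"
    }
}
```

Nothing fails here: the lookup succeeds and simply returns the wrong label, which is why the problem would go unnoticed at runtime.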

As long as the config forces them to be explicit about the values (and has error checking
at startup that the values start at "0" and are monotonically increasing ints) then anyone who
wants to "insert" values into their config is going to have to pause and think about the fact
that there is a concrete int associated with the existing values -- and is more likely to
realize that changing those ints has consequences.
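
The startup check described above could look roughly like the following sketch. It is plain Java, not the patch's code: `EnumOrdinalValidator` and `validateDense` are hypothetical names, and the label-to-int map stands in for whatever the parsed config would produce.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EnumOrdinalValidator {
    /**
     * Hypothetical startup check: the explicit ordinals, read in config order,
     * must be exactly 0, 1, 2, ... (start at 0, increasing, no gaps or repeats).
     */
    public static void validateDense(Map<String, Integer> labelToOrdinal) {
        int expected = 0;
        for (Map.Entry<String, Integer> entry : labelToOrdinal.entrySet()) {
            if (entry.getValue() != expected) {
                throw new IllegalArgumentException("enum value '" + entry.getKey()
                        + "' has ordinal " + entry.getValue() + ", expected " + expected);
            }
            expected++;
        }
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the order in which the values appear in the config.
        Map<String, Integer> config = new LinkedHashMap<>();
        config.put("LOW", 0);
        config.put("HIGH", 1);
        validateDense(config); // passes: ordinals are 0, 1

        // Inserting NONE at the front without renumbering the rest is rejected.
        config = new LinkedHashMap<>();
        config.put("NONE", 0);
        config.put("LOW", 0);
        config.put("HIGH", 1);
        try {
            validateDense(config);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A check like this cannot tell whether an admin renumbered existing values relative to what is already in the index; it only guarantees that the numbers are written out and consistent, which is the point of forcing them to be explicit.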

                
> new field type - EnumField
> --------------------------
>
>                 Key: SOLR-5084
>                 URL: https://issues.apache.org/jira/browse/SOLR-5084
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Elran Dvir
>         Attachments: enumsConfig.xml, schema_example.xml, Solr-5084.patch, Solr-5084.patch,
Solr-5084.patch, Solr-5084.patch
>
>
> We have encountered a use case in our system where we have a few fields (Severity, Risk,
etc.) with a closed set of values, where the sort order for these values is pre-determined
but not lexicographic (Critical is higher than High). Generically this is very close to how
enums work.
> To implement, I have prototyped a new type of field: EnumField where the inputs are a
closed predefined set of strings in a special configuration file (similar to currency.xml).
> The code is based on 4.2.1.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org