lucene-dev mailing list archives

From "Hoss Man (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-2599) FieldCopy Update Processor
Date Wed, 23 May 2012 01:36:41 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13281347#comment-13281347 ]

Hoss Man commented on SOLR-2599:
--------------------------------

Jan:

I did not incorporate any sort of copy field equivalent in the SOLR-2802 work, but i did implement
the "append" logic as a processor (see below)

Comments on your patch...

* my personal preference would be to use a slightly different name... (maybe "CloneFieldUpdateProcessor"
?) to help differentiate it from {{<copyField/>}} and reduce the likelihood of
confusion during casual discussion in email/irc (ie: "I'm copying field A to B..."; "wait,
are you FieldCopy-ing or CopyField-ing?")
* as mentioned in SOLR-2825 + SOLR-3095, you shouldn't need to explicitly handle "enabled"
in the individual processors
* i would eliminate the append, append.delim, and multiValued options and only support the
multiValued=true behavior - if they want the append logic they can combine this processor
with the ConcatFieldUpdateProcessorFactory
* instead of a "move=true" boolean config, i think it would be more clear what the behavior/alternatives
are if we used an "action=clone|rename" config, with the default being "clone"
* instead of the simple whitespace separated "source" field name config, it would be nice
if we could reuse the field name selector syntax options from FieldMutatingUpdateProcessorFactory
(multiple fieldName, fieldRegex, typeName, and typeClass as well as excludes of any/all of
those)
* need to think carefully about how maxChars should work:
** what if the source values aren't Strings? they could easily be numbers or dates, so it
seems like a bad idea to convert them to strings just because they are copied/renamed.
** even if all we worry about is strings, should it be maxChars per value, maxChars per source
field, or total maxChars in dest?
*** the specifics need to be documented
** personally: i would suggest ripping out the maxChars option and making it a distinct processor
that can be configured later in the chain.  if we leave it in, then i think it's really important
that it should be ignored or throw an error unless the value implements CharSequence, and
not forcibly toString() every copied value. (so this processor will still be useful with numeric
values)
* need to think carefully about field boosts:
** either we should try to preserve/combine them on move/copy, or we should make sure we explicitly
blow them away
** either way we need to document it
** if i'm reading the patch correctly it currently obliterates the boost on the dest field
in all cases, even if there are no source values to copy, and ignores any boost on any source
field, but we should double check that.
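
The "action=clone|rename" and separate-maxChars suggestions above can be sketched roughly as follows. This is a minimal, language-neutral illustration in Python, not Solr code: the document is modeled as a plain dict of field name to list of values, and the names clone_or_rename and truncate_strings are hypothetical. It assumes multiValued append semantics on the dest field and a per-value character limit, which is only one of the three maxChars interpretations listed above.

```python
# Hypothetical sketch: these names are illustrative and are NOT Solr's API.
from typing import Any, Dict, List

Doc = Dict[str, List[Any]]  # field name -> list of values (always multiValued)


def clone_or_rename(doc: Doc, sources: List[str], dest: str,
                    action: str = "clone") -> Doc:
    """Append the values of each source field to dest.

    action="clone" keeps the source fields; action="rename" removes them.
    Values are copied as-is, so numbers and dates are never stringified.
    """
    if action not in ("clone", "rename"):
        raise ValueError(f"unknown action: {action!r}")
    # Work on a copy so the caller's document is left untouched.
    out = {name: list(values) for name, values in doc.items()}
    for src in sources:
        if src == dest or src not in out:
            continue  # nothing to copy for this source
        out.setdefault(dest, []).extend(out[src])
        if action == "rename":
            del out[src]
    return out


def truncate_strings(doc: Doc, field: str, max_chars: int) -> Doc:
    """Separate, later step: limit each string value of one field.

    Assumption: the limit applies per value. Non-string values pass
    through unchanged instead of being converted to strings.
    """
    out = {name: list(values) for name, values in doc.items()}
    if field in out:
        out[field] = [v[:max_chars] if isinstance(v, str) else v
                      for v in out[field]]
    return out
```

With a document like {"title": ["Solr in Action"], "price": [42]}, cloning both fields into "all" and then truncating "all" to 4 characters would leave the number 42 intact and shorten only the string. Keeping truncation as its own step mirrors the idea of configuring it later in the chain; field boosts are deliberately left out of this sketch.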
                
> FieldCopy Update Processor
> --------------------------
>
>                 Key: SOLR-2599
>                 URL: https://issues.apache.org/jira/browse/SOLR-2599
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Jan Høydahl
>            Assignee: Jan Høydahl
>             Fix For: 4.0
>
>         Attachments: SOLR-2599.patch, SOLR-2599.patch
>
>
> Need an UpdateProcessor which can copy and move fields

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira