brooklyn-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alex Heneveld <alex.henev...@cloudsoftcorp.com>
Subject Re: [PROPOSAL] merging config keys
Date Thu, 26 May 2016 00:39:14 GMT

so ConfigInheritance.MERGE means to a deep merge, in order to handle 
templateOptions.  and if/when needed we can put in dsl support to 
prevent that.

btw if parent has `{key: {some: map}}` and i specify `key: null` in the 
child will that block a merge?  whereas `key: {}` at the child will 
result in `{key: {some: map}}`?

i think that's reasonable (and in a pick you could solve the overwrite 
map problem with an intermediate ancestor setting null)

--a


On 25/05/2016 18:20, Andrew Kennedy wrote:
> Hi,
>
> Adding my +1 for this proposal. It should make blueprint inheritance much
> more useful.
>
> I think Aled is right that 'merge' is going to be the most common use case,
> and optimising for that seems like the right idea. Then, overwrite (i.e.
> replacing the entire map) and deletion of individual keys can be
> implemented using the more complex DSL syntax. Regarding the aside, that's
> an interesting point. There's a few ways of treating it that might be
> possible to include as part of this functionality. For example, providing a
> base environment map, to which separate maps for install/customize/launch
> commands have their own maps that are then merged (or overwritten)
> individually, would be useful, as well as just being able to supply
> different map configuration. Some of this can be handled using YAML
> templating functionality, defining the base map and including it in various
> places, but that seems a bit messy; intrinsic syntax that is used
> explicitly for handling maps of data would be better.
>
> Another thing to think about when dealing with environment maps would be
> the ability to auto-populate it with a set of attributes (with some mapping
> from the dot-separated lowercase sensor name to an underscore-separated
> uppercase name) and their current values. This is a common pattern in many
> blueprints, so a simple bit of DSL syntax would be handy.
>
> The templateOptions handling does seem more complicated, but again,
> optimising for the common case is the right thing to do, at least
> initially. So, merging the leaf nodes of the tree of maps. This also
> applies to location inheritance, as well, but I guess that will be handled
> automatically if the logic happens in all `Configurable` brooklyn objects?
> Actually, there will be lists as well as maps, and maps of maps (as well as
> lists of maps, maps of maps of maps, and maps of lists and so on...!)
> structures in the entity config as well, so making merge work in the
> general case of leaf nodes in an arbitrary tree structure is going to be
> required anyway.
>
> Basically, if we follow the principle of least surprise when implementing,
> it should also be reasonably intuitive.
>
> Andrew.
>
> On Wed, 25 May 2016 at 21:56 Aled Sage <aled.sage@gmail.com> wrote:
>
>> Alex, all,
>>
>> I wondered about (1) as well. I concluded that we should optimise for
>> what I believe is the most common case for these config keys: being able
>> to add additional values and override specific values. For example, if
>> someone defines a Tomcat entity with default environment variables, then
>> I want to be able to add to those and override specific values easily.
>>
>> The alternative approach you mentioned (below) is technically elegant,
>> but feels overly complicated YAML for the common case.
>>
>>       shell.env:
>>         $brooklyn:merge:
>>         - overridden_key: value
>>           new_key: value
>>
>> If folk agree that the common case to optimise for is the "merge" for
>> config keys like shell.env, then we could limit the less-obvious
>> approach for the overwrite use-case:
>>
>>       shell.env:
>>         $brooklyn:overwrite:
>>         - new_key1: value
>>           new_key2: value
>>
>> /(As an aside, at some point I'd like to completely revisit shell.env:
>> we should better support supplying different environment variables to
>> each of the install/launch/checkRunning scripts.)/
>>
>> ---
>> For (2), I lean towards treating templateOptions (within
>> provisioningProperties) as a special case. Really we need a major
>> overhaul of our JcloudsLocation code.
>>
>> My long-term ideal would be that we don't need to put those key-values
>> within a templateOptions sub-map. That is really a consequence of our
>> implementation.
>>
>> I think merging of templateOptions is the behaviour that a user
>> (unfamiliar with the underlying implementation) would expect. Until one
>> explains about how this maps to the jclouds TemplateOptions class, there
>> is little logic for what is a top-level key-value and what goes inside
>> templateOptions. So I think a user would want them both merged.
>>
>> Aled
>>
>>
>> On 25/05/2016 17:07, Alex Heneveld wrote:
>>> two difficulties i see:
>>>
>>> 1) how do i clear inherited map values on a MERGEd config?
>>> 2) how do we specify that "templateOptions" in a
>>> "provisioningProperties" is to be merged?
>>>
>>> (followed by conclusion -- outlining an alternative but overall unsure)
>>>
>>>
>>> *1) how do i clear inherited map values on a MERGEd config?*
>>>
>>> e.g. say we have
>>>
>>> parent with
>>>
>>>      shell.env: { X: 1 }
>>>
>>> and child wants to ensure shell.env has *nothing* for X. previously
>>> child could just say
>>>
>>>      shell.env: {}
>>>
>>> however with this proposal i think the child now requires:
>>>
>>>      shell.env: { X: null }
>>>
>>> listing every key it inherits and hoping that the shell to-string
>>> excludes nulls.
>>>
>>>
>>> *2) how do we specify that "templateOptions" in a
>>> "provisioningProperties" is to be merged?*
>>>
>>> for this proposal we'll write code to understand ConfigInheritance at
>>> entities and locations.  i don't think we yet have discussed any way
>>> to do this one level deeper.  specifically if i've got
>>>
>>>      # parent
>>>      provisioningProperties:
>>>        templateOptions:
>>>          floatingIpPoolNames: pool
>>>
>>>      # child
>>>      type: parent
>>>      provisioningProperties:
>>>        templateOptions:
>>>          networks: xyz
>>>
>>> we know from the definition of PROVISIONING_PROPERTIES on
>>> SoftwareProcess that that map should be merged with its super. however
>>> when merging the actual provisioning properties we have no way to
>>> understand the semantics, do we?  specifically there is no indication
>>> that PROVISIONING_PROPERTIES is:
>>> * a map of config
>>> * usually containing config keys from JcloudsLocationConfig.
>>> without that knowledge we can't do the "depth 2" merge illustrated in
>>> the example, can we?  in other words we'll lose "floatingIpPoolNames:
>>> pool".
>>>
>>>
>>> *conclusion*
>>>
>>> these two issues aren't showstoppers but they are a little bit
>>> smelly.  apart from them the proposal is very good:  it solves an
>>> irritation around maps in a fairly simple elegant way and only
>>> impacting opt-in config-keys in a cleanly defined way.
>>>
>>> using $brooklyn:super() with a proposed $brooklyn:merge is an
>>> alternative solution which lets us solve (2) and avoids both of these
>>> issues:
>>>
>>>      shell.env:
>>>        $brooklyn:merge:
>>>        - $brooklyn:super()
>>>        - overridden_key: value
>>>          new_key: value
>>>
>>> this could also work for lists.  however it requires the user to
>>> explicitly write this, it's uglier, and it might be hard to implement.
>>>
>>> if we introduced a `$brooklyn:overwrite` we could combine aled's
>>> proposal with dsl solutions to problems (1) and (2) described here.
>>> but it makes behaviours more complicated.
>>>
>>> in short not yet sure what is best...
>>>
>>> --a
>>>
>>>
>>>
>>> On 25/05/2016 07:36, Geoff Macartney wrote:
>>>> +1
>>>>
>>>> This sounds like a good proposal.  At the same time it’s fairly
>>>> complex, so I think an important part of the change for this would be
>>>>
>>>> 1. to test it comprehensively, so each of the scenarios below would
>>>> require at least one test case, and then
>>>> 2. to document it equally comprehensively - a new subsection could be
>>>> added in the User Manual under YAML blueprints, with content taken
>>>> from the email below and beefed up for general readership
>>>>
>>>> At the moment I don’t think the documentation is comprehensive enough
>>>> about all these details (as they work today), this could be a good
>>>> opportunity to improve it.
>>>>
>>>> cheers
>>>> Geoff
>>>>
>>>>
>>>> ————————————————————
>>>> Gnu PGP key - http://is.gd/TTTTuI
>>>>
>>>>
>>>>> On 25 May 2016, at 12:44, Svetoslav Neykov
>>>>> <svetoslav.neykov@cloudsoftcorp.com> wrote:
>>>>>
>>>>> +1 for the proposal, definitely makes sense.
>>>>>
>>>>> One thing that's not clear to me is how deep the merge should be.
>>>>> Having templateOptions as an example I think it should be a shallow
>>>>> merge. Can't think of deep complex structures passed in yaml that
>>>>> would favour deep merge.
>>>>>
>>>>> Re generalizing "$brooklyn:super()" - could have it as a string key
>>>>> in maps that we want to merge. That is the owner of the map that's
>>>>> doing the override can define whether he prefers merge or override.
>>>>> It makes sense when developing blueprints because you know what the
>>>>> catalog items being inherited are and can decide which way to go.
>>>>>
>>>>> Svet.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> On 25.05.2016 г., at 14:12, Aled Sage <aled.sage@gmail.com>
wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> TL;DR: we should merge config when overriding entities/locations,
>>>>>> where it's obvious that such behaviour is desired. For example,
>>>>>> where an entity type defines shell.env, then a new entity extending
>>>>>> this type should inherit and add to those values.
>>>>>>
>>>>>>
>>>>>> _*REQUIREMENTS*_
>>>>>>
>>>>>> _*shell.env in entities*_
>>>>>>
>>>>>> When extending an existing entity type in YAML, it is not possible
>>>>>> to extend the set of environment variables. Instead, if the
>>>>>> sub-type declares shell.env it will override the inherited values.
>>>>>>
>>>>>> For example, consider the catalog items below:
>>>>>>
>>>>>>    # Catalog
>>>>>>    brooklyn.catalog:
>>>>>>       items:
>>>>>>       - id: machine-with-env
>>>>>>         item:
>>>>>>           type:
>>>>>> org.apache.brooklyn.entity.software.base.VanillaSoftwareProcess
>>>>>>           brooklyn.config:
>>>>>>             shell.env:
>>>>>>               ENV1: myEnv1
>>>>>>
>>>>>>
>>>>>>    # Blueprint
>>>>>>    location: ...
>>>>>>    services:
>>>>>>    - type: machine-with-env
>>>>>>       brooklyn.config:
>>>>>>         shell.env:
>>>>>>           ENV2: myEnv2
>>>>>>         launch.command: echo "ENV1=$ENV1, ENV2=$ENV2"
>>>>>>
>>>>>> A user might well expect the launch.command to have myEnv1 and
>>>>>> myEnv2. However, it does not get the ENV1 environment variable.
>>>>>> This is a real pain when trying to customize stock blueprints.
>>>>>>
>>>>>> We propose that the shell.env map should be *merged*.
>>>>>>
>>>>>>
>>>>>> _*provisioning.properties*_
>>>>>>
>>>>>> An entity can be configured with provisioning.properties. These are
>>>>>> passed to the location when obtaining a new machine. They
>>>>>> supplement and override the values configured on the location.
>>>>>> However, for templateOptions the expected/desired behaviour would
>>>>>> be to merge the options.
>>>>>>
>>>>>> Consider the blueprint below:_*
>>>>>> *_
>>>>>>
>>>>>>    location:
>>>>>>       minCores: 1
>>>>>>       templateOptions:
>>>>>>         networks: myNetwork
>>>>>>    services:
>>>>>>    - type: org.apache.brooklyn.entity.machine.MachineEntity
>>>>>>       brooklyn.config:
>>>>>>         provisioning.properties:
>>>>>>           minRam: 2G
>>>>>>           templateOptions:
>>>>>>             tags: myTag
>>>>>>
>>>>>> A user might well expect the VM to be created with the given
>>>>>> networks and tags. However, currently the templateOptions in
>>>>>> provisoining.properties will override the existing value, rather
>>>>>> than being merged with it.
>>>>>>
>>>>>> We propose that the templateOptions map should be *merged*.
>>>>>>
>>>>>> Valentin made a start to fix this in
>>>>>> https://github.com/apache/brooklyn-server/pull/151.
>>>>>>
>>>>>>
>>>>>> _*_*provisioning.properties in sub-entities*_
>>>>>> *_
>>>>>>
>>>>>> A similar argument holds for when extending an entity-type in YAML.
>>>>>>
>>>>>> If the super-type declares template options, then any additional
>>>>>> provisioning.properties declared on the entity sub-type should be
>>>>>> *merged* (including merging the templateOptions map contained
>>>>>> within it).
>>>>>>
>>>>>>
>>>>>> _*files.preinstall, templates.preinstall, etc*_
>>>>>>
>>>>>> The same applies for the map config for: files.preinstall,
>>>>>> templates.preinstall, files.install, templates.install,
>>>>>> files.runtime and templates.runtime.
>>>>>>
>>>>>> We propose that these maps get *merged* with the value defined in
>>>>>> the super-type.
>>>>>>
>>>>>>
>>>>>> _*Overriding default values*_
>>>>>>
>>>>>> For default values in the super-type, we propose that this value
>>>>>> *does* get overridden, rather than merged.
>>>>>>
>>>>>> For example, in the blueprint below we suggest that the
>>>>>> launch-command in the sub-type should have ENV2 but not
>>>>>> ENV_IN_DEFAULT.
>>>>>>
>>>>>>    brooklyn.catalog:
>>>>>>       items:
>>>>>>       - id: machine-with-env
>>>>>>         version: 1.0.0
>>>>>>         item:
>>>>>>           type:
>>>>>> org.apache.brooklyn.entity.software.base.VanillaSoftwareProcess
>>>>>>           brooklyn.parameters:
>>>>>>           - name: shell.env
>>>>>>             default:
>>>>>>               ENV_IN_DEFAULT: myEnvInDefault
>>>>>>       - id: machine-with-env-2
>>>>>>         version: 1.0.0
>>>>>>         item:
>>>>>>           type: machine-with-env
>>>>>>           brooklyn.config:
>>>>>>             shell.env:
>>>>>>               ENV2: myEnv2
>>>>>>             launch.command: echo "ENV_IN_DEFAULT=$ENV_IN_DEFAULT,
>>>>>>    ENV2=$ENV2"
>>>>>>
>>>>>> (Interestingly, the current behaviour of machine-with-env is that
>>>>>> it gets the value for ENV_IN_DEFAULT but not for ENV2, so sometime
>>>>>> strange is going on with re-defining the shell.env config key!)
>>>>>>
>>>>>>
>>>>>> _*Extending commands: deferred*_
>>>>>>
>>>>>> Another scenario is where a super-type declares a value for
>>>>>> `install.command`, and the sub-type wants to augment this by adding
>>>>>> additional commands. Currently that is not possible. Instead the
>>>>>> sub-type needs to use pre.install.command and/or
>>>>>> post.install.command. But that leads to the same problem if a
>>>>>> super-type also has a value defined for that key.
>>>>>>
>>>>>> Svet suggested we could perhaps introduce something like
>>>>>> $brooklyn:super().
>>>>>>
>>>>>> Unless we can generalise that approach to also solve the merging
of
>>>>>> `shell.env` etc, then I suggest we defer the `install.command`
>>>>>> use-case. That can be proposed and discussed in a different thread.
>>>>>>
>>>>>> However, if we can solve these problems with clever explicit use
of
>>>>>> $brooklyn:super(), then that could provide an elegant solution to
>>>>>> all of these problems!
>>>>>>
>>>>>>
>>>>>> _*Inheritance from parent entities*_
>>>>>>
>>>>>> Things are made yet more complicated by the fact we inherit config
>>>>>> from parent entities, in the entity hierarchy.
>>>>>>
>>>>>> We propose that this behaviour is also configurable for the config
>>>>>> key, but that the defaults stay as they are. The existing logic is
>>>>>> applied to find the config value that applies to the given entity.
>>>>>> That value is then merged with its super-type, as appropriate.
>>>>>>
>>>>>> For example, in the blueprint below... machine1 would get ENV1 and
>>>>>> ENV2 (i.e. the ENV1 definition overrides the ENV_IN_APP
>>>>>> definition). However, machine2 would get ENV1 and ENV_IN_APP (i.e.
>>>>>> it inherits ENV_IN_APP from the parent, and this is meged with the
>>>>>> super-type).
>>>>>>
>>>>>>    services:
>>>>>>    - type: org.apache.brooklyn.entity.stock.BasicApplication
>>>>>>       brooklyn.config:
>>>>>>         shell.env:
>>>>>>           ENV_IN_APP: myEnvInApp
>>>>>>       brooklyn.children:
>>>>>>       - type: machine-with-env
>>>>>>         id: machine1
>>>>>>         brooklyn.config:
>>>>>>           shell.env:
>>>>>>             ENV2: myEnv2
>>>>>>       - type: machine-with-env
>>>>>>         id: machine2
>>>>>>
>>>>>> The reasoning behind this is to figure out the inheritance/override
>>>>>> rules incrementally. We leave the parent-inheritance as-is, and
>>>>>> just focus on the sub-typing inheritance.
>>>>>>
>>>>>> Note that there is already a ConfigInheritance defined on ConfigKey
>>>>>> for controlling this kind of inheritance from the parent. The legal
>>>>>> values for ConfigInheritance are currently just ALWAYS and NONE.
>>>>>>
>>>>>>
>>>>>> _*IMPLEMENTATION*_
>>>>>>
>>>>>> Clearly we do not want to implement this piecemeal. We'll add a way
>>>>>> to declare that a config key should be merged with that value from
>>>>>> the super-type.
>>>>>>
>>>>>> We'll change the Java ConfigKey code to be:
>>>>>>
>>>>>>    public interface ConfigKey {
>>>>>>       /**
>>>>>>        * @since 0.10.0
>>>>>>        */
>>>>>>       @Nullable ConfigInheritance getParentInheritance();
>>>>>>
>>>>>>       /**
>>>>>>        * @since 0.10.0
>>>>>>        */
>>>>>>    @Nullable ConfigInheritance getTypeInheritance();
>>>>>>
>>>>>>       /**
>>>>>>        * @deprecated since 0.10.0; instead use {@link
>>>>>>    #getParentInheritance()}
>>>>>>        */
>>>>>>       @Nullable ConfigInheritance getInheritance();
>>>>>>    }
>>>>>>
>>>>>> We'll add to ConfigInheritance support for MERGE. We'll change the
>>>>>> name "ALWAYS" to OVERRIDE (deprecating the old value).
>>>>>>
>>>>>> We'll change EntityConfigMap.getConfig to handle this new merge
>>>>>> behaviour. And same for locations, policies and enrichers.
>>>>>>
>>>>>> Aled
>>>>>>
>>>>>>
>>>
>> --
> Andrew Kennedy ; Founder clocker.io project ; @grkvlt ; Cloudsoft
>


Mime
View raw message