heron-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Josh Fischer <j...@joshfischer.io>
Subject Re: Proposing Changes To ECO
Date Mon, 22 Jan 2018 21:51:35 GMT
Ning,

In my email I was thinking specifically of setting the componentRam.  This
is case the value is a comma delimited string value which would be easy to
incorrectly format the list of values to be appended.   An image to
reference is below.  So by passing in a list of values, I could then
correctly format the value String as we would expect.

public static void setComponentRam(Map<String, Object> conf,
                                   String component, ByteAmount ramInBytes) {
  if (conf.containsKey(Config.TOPOLOGY_COMPONENT_RAMMAP)) {
    String oldEntry = (String) conf.get(Config.TOPOLOGY_COMPONENT_RAMMAP);
    String newEntry = String.format("%s,%s:%d", oldEntry, component,
ramInBytes.asBytes());
    conf.put(Config.TOPOLOGY_COMPONENT_RAMMAP, newEntry);
  } else {
    String newEntry = String.format("%s:%d", component, ramInBytes.asBytes());
    conf.put(Config.TOPOLOGY_COMPONENT_RAMMAP, newEntry);
  }
}

I'm glad you sent this email as it got me thinking about the above spec
that Karthik mentioned.  I've copied his spec below


config:
  topology.workers: 2
  topology.component.resourcemap:

    - id: "component-1"
      ram: 1234MB
      cpu: 0.5
      disk: 123MB

   - id: "component-2"
     ram: 2345MB
     cpu: 0.75
     disk: 4GB

I think disk and cpu resources are allocated at a topology level and would
not be applicable here.  Unless there is a way that you specify this
through the Heron Config class?..  After looking at the docs here
https://twitter.github.io/heron/docs/developers/tuning/ and looking at the
Heron Config class, I don't see way to specify these at a component level.
I do see there is a way to pass any configuration up to Heron, can I set
this values via a `prepare()` or `open()` call?

One last note while thinking about this.  `setComponentJvmOptions()` has a
similar behavior.  I would have this do the same for this field too I
believe


public static void setComponentJvmOptions(
    Map<String, Object> conf,
    String component,
    String jvmOptions) {
  String optsBase64;
  String componentBase64;

  optsBase64 = DatatypeConverter.printBase64Binary(
      jvmOptions.getBytes(StandardCharsets.UTF_8));
  componentBase64 = DatatypeConverter.printBase64Binary(
      component.getBytes(StandardCharsets.UTF_8));

  String oldEntry = (String) conf.get(Config.TOPOLOGY_COMPONENT_JVMOPTS);
  String newEntry;
  if (oldEntry == null) {
    newEntry = String.format("{\"%s\":\"%s\"}", componentBase64, optsBase64);
  } else {
    // To remove the '{' at the start and '}' at the end
    oldEntry = oldEntry.substring(1, oldEntry.length() - 1);
    newEntry = String.format("{%s,\"%s\":\"%s\"}", oldEntry,
componentBase64, optsBase64);
  }
  // Format for TOPOLOGY_COMPONENT_JVMOPTS would be a json map like this:
  //  {
  //     "componentNameAInBase64": "jvmOptionsInBase64",
  //     "componentNameBInBase64": "jvmOptionsInBase64"
  //  }
  conf.put(Config.TOPOLOGY_COMPONENT_JVMOPTS, newEntry);

}



If I've missed something please let me know.

-Josh


On Mon, Jan 22, 2018 at 12:02 PM, Ning Wang <wangninggm@gmail.com> wrote:

> LGTM. And I like the 123MB  more than separating value and unit into two
> settings.
>
> Quick questions:
> This new config will replace the existing topology.component.rammap?
> "the way ECO handles topology configuration will not work for all
> configuration types". Can you give a more specific example?
>
> Thanks.
>
>
>
>
>
> On Mon, Jan 22, 2018 at 9:33 AM, Karthik Ramasamy <karthik@streaml.io>
> wrote:
>
> > Josh -
> >
> > One more feedback - since the resources assigned can be CPU, RAM, DISK -
> > instead of calling it
> >
> > topology.component.rammap
> >
> > can we call it
> >
> > topology.component.resourcemap
> >
> > and allow for CPU and DISK. Furthermore, we append the size type into the
> > metric as follows
> >
> > config:
> >   topology.workers: 2
> >   topology.component.resourcemap:
> >
> >     - id: "component-1"
> >       ram: 1234MB
> >       cpu: 0.5
> >       disk: 123MB
> >
> >    - id: "component-2"
> >      ram: 2345MB
> >      cpu: 0.75
> >      disk: 4GB
> >
> > This will make it easier to read and also flexible, thoughts?
> >
> > cheers
> > /karthik
> >
> >
> >
> > cheers
> > /karthik
> >
> > On Sun, Jan 21, 2018 at 6:18 PM, Josh Fischer <josh@joshfischer.io>
> wrote:
> >
> > > To All,
> > >
> > > I think I made  a mistake in my previous email
> > >
> > > config:
> > >   topology.workers: 2
> > >   topology.component.rammap:
> > >     - "some-id": 1234
> > >     - "other-id": 6789
> > >
> > >
> > > I think the yaml above is incorrect as well as other examples.  I think
> > we
> > > would have to do something like below
> > >
> > >
> > > config:
> > >   topology.workers: 2
> > >   topology.component.rammap:
> > >     - "some-id:1234"
> > >     - "other-id:6789"
> > >
> > > Which would then product a list of strings that would match the way the
> > > topology_component_rammap is set via other apis.  The problem with this
> > > approach is it would be easy for someone to make a mistake within the
> > > formatting of the strings and would then cause us to have to validate
> the
> > > format to fit the specs.  I think the approach below would be better.
> I
> > > would then just take the input, do some validation and conversion via
> the
> > > ByteAmount class and generate a properly formatted string to fit the
> > specs
> > > of the topology_component_rammap values.
> > >
> > > config:
> > >   topology.workers: 2
> > >   topology.component.rammap:
> > >
> > >     - id: "component-1"
> > >       size: 1234
> > >       type: MB  // Megabytes
> > >
> > >     - id: "component-2"
> > >       size: 6789
> > >       type: GB // GigaBytes
> > >
> > >     - id: "component-3"
> > >       size: 123456789
> > >       type: B // Bytes
> > >
> > >
> > >
> > > Hope I was clear with trying to explain things.  Of course I will also
> be
> > > creating the docs as well to explain usage.
> > >
> > > -Josh
> > >
> > > On Sun, Jan 21, 2018 at 8:21 AM, Josh Fischer <josh@joshfischer.io>
> > wrote:
> > >
> > > > All,
> > > >
> > > > While working with Karthik, we have discovered that the way ECO
> handles
> > > > topology configuration will not work for all configuration types.  To
> > be
> > > > specific, setting individual component's ram will not work.  We will
> > also
> > > > have to keep in mind container size that contains the components.  My
> > > > proposal is this:
> > > >
> > > > Create a standardized way to allow for the  configuring of  component
> > ram
> > > > size in the "config" section of the  eco yaml file.  This would be a
> > list
> > > > of key value pairs that mapped the "id" of a component to an
> allocated
> > > ram
> > > > size in MB.  An example is below:
> > > >
> > > > config:
> > > >   topology.workers: 2
> > > >   topology.component.rammap:
> > > >     - "some-id": 1234
> > > >     - "other-id": 6789
> > > >
> > > > However the  above implementation may be unclear when it comes to
> > > > understanding what unit of measurement is implicitly specified and/or
> > > > expected.  Or we could do something like below.
> > > >
> > > > config:
> > > >   topology.workers: 2
> > > >   topology.component.rammap:
> > > >     - spec:
> > > >        id: "component-1"
> > > >        size: 1234
> > > >        type: MB  // Megabytes
> > > >     - spec:
> > > >        id: "component-2"
> > > >        size: 6789
> > > >        type: GB // GigaBytes
> > > >     - spec:
> > > >        id: "component-3"
> > > >        size: 123456789
> > > >        type: B // Bytes
> > > >
> > > >
> > > > If a mapping is not specified for a component, we can just assume
> > Heron's
> > > > defaults.  We could then dynamically calculate the container size
> based
> > > off
> > > > of the number components and their corresponding allocated resources
> > for
> > > > simplicity of use for the user, but still allow them to specify a
> > custom
> > > > set of resources to a container like below
> > > >
> > > > topology.container.disk: 1234
> > > > topology.container.ram: 3456
> > > > topology.container.cpu: 2
> > > >
> > > >
> > > > It may be best if I reused the ByteAmount object to calculate
> resource
> > > > size to remain consistent with the other Heron APIs.  Any concerns or
> > > > improvements to this approach I am missing?
> > > >
> > > > Please Advise,
> > > >
> > > > Josh
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message