heron-dev mailing list archives

From Neng Lu <freen...@gmail.com>
Subject Re: Specifying Operator Resource in DSL
Date Fri, 22 Sep 2017 06:33:42 GMT
@bill As far as I know, people are still doing the same thing with
Summingbird. And absolutely `Yes`, we should allow some kind of mechanism
to specify resources for operators.

@sanjeev Thanks for the clarification. Based on my experience with Twitter
topologies, users want to specify resources even for simple operators if
the topology is really critical to them. So I personally prefer a mechanism
generic enough that users can specify resources for any operator.
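
One way to picture such a generic mechanism is the rough Python sketch below.
Everything in it is hypothetical: `Stream`, `Operator`, `Resources`,
`resolve_resources`, and the `name` keyword are made up for illustration and
are not the Heron DSL API. It assumes operators take an optional name and that
resources are looked up by that name from a separate config, with a default
for unnamed operators.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Resources:
    """Hypothetical per-operator resource request (not a Heron class)."""
    cpu: float = 1.0
    ram_mb: int = 512


@dataclass
class Operator:
    kind: str                    # "map", "filter", ...
    fn: Callable
    name: Optional[str] = None   # optional, so unnamed operators stay terse


@dataclass
class Stream:
    """Toy stand-in for a DSL stream; it only records the operator chain."""
    operators: List[Operator] = field(default_factory=list)

    def map(self, fn: Callable, name: Optional[str] = None) -> "Stream":
        return Stream(self.operators + [Operator("map", fn, name)])

    def filter(self, fn: Callable, name: Optional[str] = None) -> "Stream":
        return Stream(self.operators + [Operator("filter", fn, name)])


def resolve_resources(
    stream: Stream,
    overrides: Dict[str, Resources],
    default: Resources = Resources(),
) -> List[Tuple[str, Resources]]:
    """Pair each operator with its resources without touching the job itself."""
    resolved = []
    for index, op in enumerate(stream.operators):
        # Unnamed operators get a generated label and the default resources.
        label = op.name or f"{op.kind}-{index}"
        resolved.append((label, overrides.get(op.name, default)))
    return resolved


# The job is defined once; only the named operator gets an override.
job = Stream().map(lambda x: x * 2, name="double").filter(lambda x: x > 10)
prod_config = {"double": Resources(cpu=2.0, ram_mb=2048)}
plan = resolve_resources(job, prod_config)
```

The same `job` can be resolved against a different config (for example an
empty dict for a test run), which is the kind of separation between job
definition and resource request discussed in the quoted thread. The sketch
does not address what happens when the optimizer fuses several operators
into one.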



On Thu, Sep 21, 2017 at 16:05 Sanjeev Kulkarni <sanjeevrk@gmail.com> wrote:

> Neng,
> https://github.com/twitter/heron/pull/2334
> provides this abstraction.
> The issue, however, is as follows. In the Spout/Bolt world, every component
> is explicitly named by the topology writer, and thus all resources can be
> specified on a per-component basis. However, in the DSL world, a) the
> operators themselves don't have names and b) optimizations can squish several
> operators into a single physical operator. One possibility would be to
> optionally add a name to the operator (like map(mapfn, name)), but that seems
> too cumbersome/kludgy.
>
> On Thu, Sep 21, 2017 at 3:57 PM, Neng Lu <freeneng@gmail.com> wrote:
>
> > Just adding some thoughts here: for ordinary Heron topologies, the
> > definition of a Heron job and the resource request for each component are
> > separated: `TopologyBuilder` for job definition, `Config` for resource
> > requirements.
> >
> > In the DSL case, if we could also do something similar that separates DSL
> > job creation from the resource request, it would be really good. With this
> > separation, people have the flexibility of providing different configs for
> > the same job.
> >
> >
> > On Wed, Sep 20, 2017 at 1:48 PM, Sanjeev Kulkarni <sanjeevrk@gmail.com>
> > wrote:
> >
> > > Hi folks,
> > > One of the great features of the lower-level spout/bolt interface in
> > > Heron is the ability to specify the resources needed on a per-component
> > > basis. This feature is very helpful for tuning large topologies and is
> > > heavily used inside Twitter.
> > > Currently the DSL does not have this flexibility. I wanted to get
> > > opinions about how we can add this.
> > > There are probably several ways to do it. I'm listing a few approaches
> > > that have come to my mind. Please feel free to add more.
> > > 1) Currently some of our operators are simple (like the flatMap, map, and
> > > filter operators), while others are a little more complicated (like
> > > transform, where users can perform setup/cleanup). We can take the approach
> > > of adding the ability to specify resources only for complex operators. Thus
> > > transform could have two variants: the current one, which just takes a
> > > transform function, and another that takes a resource parameter as well.
> > > The rest of the operators (map, flatMap, filter, etc.) will remain the
> > > same. The advantage of this is that the interface explosion is minimal and
> > > controlled. The con is that if you need to control the resources of a
> > > particular operator, you are forced to use transform.
> > > 2) Another approach would be to add a variant that takes a Resource
> > > parameter to all operators. The pro is that this gives fine-grained
> > > control over all operators. The con is the interface blow-up.
> > >
> > > Thoughts?
> > >
> >
>

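For reference, the two options in the original proposal differ only in which
operators accept a resource argument. Below is a rough Python sketch of that
difference. The names `Resources`, `StreamOptionOne`, and `StreamOptionTwo`
are made up for illustration and are not Heron classes, and `transform` is
simplified to take a plain function rather than an object with setup/cleanup
hooks.

```python
from typing import Callable, List, NamedTuple, Optional, Tuple


class Resources(NamedTuple):
    """Hypothetical resource request (not a Heron class)."""
    cpu: float
    ram_mb: int


# Each recorded step: (operator kind, user function, optional resources).
Step = Tuple[str, Callable, Optional[Resources]]


class StreamOptionOne:
    """Option 1: only transform() has a resource-taking variant."""

    def __init__(self, steps: Optional[List[Step]] = None):
        self.steps: List[Step] = steps or []

    def map(self, fn: Callable) -> "StreamOptionOne":
        return StreamOptionOne(self.steps + [("map", fn, None)])

    def transform(
        self, fn: Callable, resources: Optional[Resources] = None
    ) -> "StreamOptionOne":
        return StreamOptionOne(self.steps + [("transform", fn, resources)])


class StreamOptionTwo:
    """Option 2: every operator accepts an optional resource argument."""

    def __init__(self, steps: Optional[List[Step]] = None):
        self.steps: List[Step] = steps or []

    def map(
        self, fn: Callable, resources: Optional[Resources] = None
    ) -> "StreamOptionTwo":
        return StreamOptionTwo(self.steps + [("map", fn, resources)])

    def transform(
        self, fn: Callable, resources: Optional[Resources] = None
    ) -> "StreamOptionTwo":
        return StreamOptionTwo(self.steps + [("transform", fn, resources)])


big = Resources(cpu=2.0, ram_mb=4096)

# Option 1: sizing a simple mapping step means rewriting it as a transform.
one = StreamOptionOne().transform(lambda x: x + 1, resources=big)

# Option 2: the map operator itself carries the resource request.
two = StreamOptionTwo().map(lambda x: x + 1, resources=big)
```

In a language with optional keyword arguments the second option adds one
parameter per operator. In an API built on method overloads it adds one extra
variant per operator, which is the interface blow-up the proposal mentions.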