apex-dev mailing list archives

From Sanjay Pujare <san...@datatorrent.com>
Subject Re: Visitor API for DAG
Date Thu, 08 Dec 2016 22:03:08 GMT
Thinking more about it, it makes sense to have generic hooks for post-launch use cases so I
support the idea of making it generic.

On 12/7/16, 10:09 AM, "Sanjay Pujare" <sanjay@datatorrent.com> wrote:

    Should we continue the discussion in the JIRA?
    
    Making it generic the way you are suggesting – won’t it cause problems of consistency
    etc., e.g. different hooks doing different things? Also, are there use cases for such
    generic hooks to justify the effort involved?
    
    On 12/7/16, 7:29 AM, "Pramod Immaneni" <pramod@datatorrent.com> wrote:
    
        Tushar,
        
        Why specifically limit it to client side preparation of the DAG before the
        application is launched? Why not make it possible to have general hooks
        that can apply even when the application is running, in the different
        distributed components of the application such as the containers and stram?
        The hooks could be registered to be asynchronous or synchronous. The
        asynchronous ones could be handled via a bus; we already have a bus
        called mbassador that we use today and it could potentially be used for
        this. The entire functionality is akin to something like dtrace without the
        dynamic part.
        
        Thanks
        
        On Fri, Nov 25, 2016 at 3:24 AM, Tushar Gosavi <tushar@datatorrent.com>
        wrote:
        
        > Opened a Jira https://issues.apache.org/jira/browse/APEXCORE-577 for this.
        >
        > - Tushar.
        >
        >
        > On Mon, Nov 21, 2016 at 9:59 PM, Amol Kekre <amol@datatorrent.com> wrote:
        > > Ananth,
        > > The current API allows changing properties of a running app. The new
        > > proposed API is not needed to do so.
        > >
        > > Thks
        > > Amol
        > >
        > >
        > > On Sun, Nov 20, 2016 at 10:43 PM, Tushar Gosavi <tushar@datatorrent.com>
        > > wrote:
        > >
        > >> Hi Ananth,
        > >>
        > >> We can not change runtime properties through this API. The current
        > >> flow of apex application execution is
        > >>
        > >> 1) StramClient prepares the application, injects properties from
        > >> properties.xml / user provided xml files and validates the dag
        > >> 2) StramClient copies required jars and the serialized plan to HDFS and
        > >> launches the master container.
        > >> 3) Application master reads the serialised plan from HDFS and starts
        > >> deploying StramClient as per deployment plan.
        > >>
        > >> The visitor will examine the DAG in stage 1, hence visitor can only
        > >> change the initial state. The execution of the application is not
        > >> affected.
        > >>
        > >> - Tushar.
        > >>
        > >>
        > >> On Sat, Nov 19, 2016 at 3:40 AM, ananth <ananthg.apex@gmail.com> wrote:
        > >> > How does this work for the stateful operators ? Can we use this to override
        > >> > properties that are deserialized ?
        > >> >
        > >> > Regards,
        > >> >
        > >> > Ananth
        > >> >
        > >> >
        > >> >
        > >> > On 18/11/16 05:53, Tushar Gosavi wrote:
        > >> >>
        > >> >> The code will execute before application master is launched, it is
        > >> >> just one time activity during application startup. Few use cases I
        > >> >> could think are
        > >> >>
        > >> >> - Operator validation/configuration validator
        > >> >>    jdbc operator could check if database is accessible with given
        > >> >> credentials.
        > >> >>    file output operator could check if directory exists and filesystem is
        > >> >> writable.
        > >> >>
        > >> >> - Injection of properties in operators from external sources.
        > >> >>
        > >> >> - If two operators want to exchange some information based on
        > >> >> configuration, they could do it through visitor. for example
        > >> >> TUPLE_SCHEMA can be set on downstream operator port based on operator's
        > >> >> input TUPLE_SCHEMA and its configuration (for example projection
        > >> >> operator which drops few columns, could create a new class with fewer
        > >> >> fields and set it as tuple schema on downstream operator port).
        > >> >>
        > >> >> - For pojo enabled operator (port where TUPLE_SCHEMA is defined), an
        > >> >> efficient stream codec could be written using asm library for
        > >> >> serialisation and use that as stream codec instead of default one.
        > >> >>
        > >> >> -Tushar.
        > >> >>
        > >> >>
        > >> >> On Thu, Nov 17, 2016 at 11:35 PM, Sanjay Pujare <sanjay@datatorrent.com>
        > >> >> wrote:
        > >> >>>
        > >> >>> There is a risk if the user written code blocks the thread or crashes the
        > >> >>> process. What are the real life examples of this use case?
        > >> >>>
        > >> >>>
        > >> >>> On 11/17/16, 9:21 AM, "amol kekre" <amolhkekre@gmail.com> wrote:
        > >> >>>
        > >> >>>      +1. Opening up the API for users to put in their own code is good. In
        > >> >>>      general we should enable users to register their code in a lot of
        > >> >>>      scenarios.
        > >> >>>
        > >> >>>      Thks
        > >> >>>      Amol
        > >> >>>
        > >> >>>      On Thu, Nov 17, 2016 at 9:06 AM, Tushar Gosavi <tushar@datatorrent.com>
        > >> >>>      wrote:
        > >> >>>
        > >> >>>      > Yes, It could happen after current DAG validation and before the
        > >> >>>      > application master is launched.
        > >> >>>      >
        > >> >>>      > - Tushar.
        > >> >>>      >
        > >> >>>      >
        > >> >>>      > On Thu, Nov 17, 2016 at 8:32 PM, Munagala Ramanath <ram@datatorrent.com>
        > >> >>>      > wrote:
        > >> >>>      > > When would the visits happen ? Just before normal validation ?
        > >> >>>      > >
        > >> >>>      > > Ram
        > >> >>>      > >
        > >> >>>      > > On Wed, Nov 16, 2016 at 9:50 PM, Tushar Gosavi <tushar@apache.org>
        > >> >>>      > > wrote:
        > >> >>>      > >
        > >> >>>      > >> Hi All,
        > >> >>>      > >>
        > >> >>>      > >> How about adding visitor like API for DAG in Apex, and an api to
        > >> >>>      > >> register visitor for the DAG.
        > >> >>>      > >> Possible use cases are
        > >> >>>      > >> -  Validator visitor which could validate the dag
        > >> >>>      > >> -  Visitor to inject properties/attribute in the operator/streams from
        > >> >>>      > >> some external sources.
        > >> >>>      > >> -  Platform does not support validation of individual operators.
        > >> >>>      > >> developer could write a validator visitor which would call validate
        > >> >>>      > >> function of operator if it implements Validator interface.
        > >> >>>      > >> - generate output schema based on operator config and input schema,
        > >> >>>      > >> and set the schema on output stream.
        > >> >>>      > >>
        > >> >>>      > >> Sample API :
        > >> >>>      > >>
        > >> >>>      > >> dag.registerVisitor(DAGVisitor visitor);
        > >> >>>      > >>
        > >> >>>      > >> Call order of visitorFunctions.
        > >> >>>      > >> - preVisitDAG(Attributes) // dag attributes
        > >> >>>      > >>   for all operators
        > >> >>>      > >>   - visitOperator(OperatorMeta meta) // access to operator, name, attributes, properties, ports
        > >> >>>      > >>   - visitStream(StreamMeta meta) // access to stream/name/attributes/properties/ports
        > >> >>>      > >> - postVisitDAG()
        > >> >>>      > >>
        > >> >>>      > >> Regards,
        > >> >>>      > >> -Tushar.
        > >> >>>      > >>
        > >> >>>      >
        > >> >>>
        > >> >>>
        > >> >>>
        > >> >
        > >>
        >
        
    
    
    

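The call order in the sample API at the bottom of the thread could be sketched in Java roughly as follows. This is only an illustration under stated assumptions: `OperatorInfo`, `StreamInfo`, `SketchDag` and `PropertyInjectingVisitor` are hypothetical stand-ins, not Apex classes (the proposal itself refers to `OperatorMeta` and `StreamMeta`), and the sketch assumes all operators are visited before all streams, which the proposal does not spell out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for the operator/stream metadata named in the proposal.
final class OperatorInfo {
    final String name;
    final Map<String, String> properties = new LinkedHashMap<>();
    OperatorInfo(String name) { this.name = name; }
}

final class StreamInfo {
    final String name;
    StreamInfo(String name) { this.name = name; }
}

// Visitor callbacks, named after the call order proposed in the thread.
interface DagVisitor {
    void preVisitDAG(Map<String, String> dagAttributes);
    void visitOperator(OperatorInfo operator);
    void visitStream(StreamInfo stream);
    void postVisitDAG();
}

// Toy DAG (not Apex's) that drives registered visitors in that order.
final class SketchDag {
    final Map<String, String> attributes = new LinkedHashMap<>();
    final List<OperatorInfo> operators = new ArrayList<>();
    final List<StreamInfo> streams = new ArrayList<>();
    private final List<DagVisitor> visitors = new ArrayList<>();

    void registerVisitor(DagVisitor visitor) { visitors.add(visitor); }

    // In the proposal this would run on the client, after DAG validation
    // and before the application master is launched.
    void runVisitors() {
        for (DagVisitor v : visitors) {
            v.preVisitDAG(attributes);
            for (OperatorInfo op : operators) v.visitOperator(op);
            for (StreamInfo s : streams) v.visitStream(s);
            v.postVisitDAG();
        }
    }
}

// Example visitor: injects one property into every operator and records the calls it receives.
final class PropertyInjectingVisitor implements DagVisitor {
    final List<String> calls = new ArrayList<>();
    private final String key;
    private final String value;

    PropertyInjectingVisitor(String key, String value) {
        this.key = key;
        this.value = value;
    }

    @Override public void preVisitDAG(Map<String, String> dagAttributes) { calls.add("preVisitDAG"); }

    @Override public void visitOperator(OperatorInfo operator) {
        operator.properties.put(key, value);
        calls.add("visitOperator:" + operator.name);
    }

    @Override public void visitStream(StreamInfo stream) { calls.add("visitStream:" + stream.name); }

    @Override public void postVisitDAG() { calls.add("postVisitDAG"); }
}
```

Because the visitor only touches the in-memory plan before it is serialized, it can change the initial state of operators but not a running application, which matches the limitation described in the thread.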


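The synchronous versus asynchronous hook registration suggested in the thread might look something like the following. Again this is a sketch only: `HookRegistry` and its methods are hypothetical, and a single-thread `ExecutorService` stands in for a bus such as mbassador, whose API is not used here.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical hook registry; events are plain strings to keep the sketch small.
final class HookRegistry {
    private final List<Consumer<String>> syncHooks = new CopyOnWriteArrayList<>();
    private final List<Consumer<String>> asyncHooks = new CopyOnWriteArrayList<>();
    // Stand-in for an event bus: one background thread delivering events in order.
    private final ExecutorService bus = Executors.newSingleThreadExecutor();

    void registerSync(Consumer<String> hook) { syncHooks.add(hook); }

    void registerAsync(Consumer<String> hook) { asyncHooks.add(hook); }

    // Synchronous hooks run on the caller's thread before fire() returns;
    // asynchronous hooks are queued and run later on the bus thread.
    void fire(String event) {
        for (Consumer<String> hook : syncHooks) {
            hook.accept(event);
        }
        for (Consumer<String> hook : asyncHooks) {
            bus.execute(() -> hook.accept(event));
        }
    }

    // Stops accepting events and waits up to five seconds for queued asynchronous hooks.
    void shutdown() {
        bus.shutdown();
        try {
            bus.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
    }
}
```

A blocking or crashing synchronous hook would stall or fail the caller, which is the risk raised earlier in the thread; routing hooks through a separate thread keeps a slow hook from blocking the caller, at the cost of the caller not knowing when, or whether, the hook has completed.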