apex-dev mailing list archives

From Atri Sharma <atri.j...@gmail.com>
Subject Re: Modules support in Apex
Date Fri, 04 Sep 2015 08:07:43 GMT
Please find JIRA:

https://malhar.atlassian.net/browse/APEX-95

On Fri, Sep 4, 2015 at 2:43 AM, Amol Kekre <amol@datatorrent.com> wrote:

> Atri,
> Dynamic changes were in the plans. We missed putting it in open source
> jira. Please open a jira in APEX for dynamic changes to DAG.
>
> Thks,
> Amol
>
> On Thu, Sep 3, 2015 at 1:09 PM, Atri Sharma <atri.jiit@gmail.com> wrote:
>
> > > On Thu, Sep 3, 2015 at 11:55 AM, Atri Sharma <atri.jiit@gmail.com> wrote:
> > >
> > > > Thanks for the detailed explanation!
> > > >
> > > > I have a few small questions. Please find them inline.
> > > >
> > > > On 3 Sep 2015 21:03, "Amol Kekre" <amol@datatorrent.com> wrote:
> > > >
> > > > > Design time -> design, try, iterate => Re-use of IP saves a lot of
> > > > > time here. Operators help as leaf level IP; modules help as higher
> > > > > level IP that can be made by combining previously tested leaf level
> > > > > IP. Additionally, they should (a) be correct by construction, since
> > > > > they leverage other operators, and (b) allow properties and
> > > > > attributes to make the re-used IP a lot more powerful. Modules do not
> > > > > prohibit the Apex engine from optimizing the DAG during launch (or
> > > > > even run) time, all the while enabling re-use at a higher level. In
> > > > > fact they may aid optimization by giving specific hints at a sub-DAG
> > > > > level, the same as functions, templates, etc. in C++. This phase is
> > > > > mainly human time. It results in a "logical plan" of the App DAG.
> > > > >
> > > >
> > > > So essentially a logical plan is the one manually defined by the users
> > > > using operators and modules, correct?
> > > >
> > >
> > > Yes.
> >
> > Thanks!
> >
> > >
> > > >
> > > > (I understand the logical plan is generated from JSON files; I just want
> > > > to clarify whether the JSON files are the ones defined by the user to
> > > > define the DAG.)
> > > >
> > > > > Launch time -> Usually a one-time cost. This is always completely
> > > > > automated in any compiler (aka compile time). So stuffing more here
> > > > > moves us from human time to computer time. A module moves a lot more
> > > > > work from human time to computer time than a leaf level operator
> > > > > does. This results in the physical plan of the App DAG.
> > > >
> > > > If I understand correctly, this is where we expand Modules, i.e. call
> > > > the Module's populateDAG method, correct?
> > > >
> > > >
> > > Yes
> >
> > Thanks!
> >
> > >
> > >
> > > > > Run time -> Actual app run time. At this stage the flattening has
> > > > > already happened. Apex apps are meant to run forever or for a long
> > > > > time. These are not 1 min apps. So an extra 10 seconds during launch
> > > > > time in a big data project is amortized away over days, months, or
> > > > > at least a few hours.
> > > >
> > > > Got that, thanks!
> > > >
> > > > > Operators do that at leaf level, modules enable true distributed
> > > > > execution.
> > > >
> > > > Is the distributed nature due to reuse?
> > > >
> > > >
> > > The distributed nature comes from the fact that the Apex engine can now
> > > take this DAG (called the physical plan), get resources from RM, and
> > > create an execution plan (where each operator runs). The Apex engine
> > > then distributes it. By flattening the module in the logical plan to
> > > physical plan phase, we leverage the Apex engine's built-in distributed
> > > execution. The logical plan -> physical plan -> execution plan sequence
> > > is very common across a lot of distributed engines.
> > >
> > Got that, thanks.
> > >
> > > > > I am hoping to see this thesis proved by making Calcite integration
> > > > > easier.
> > > >
> > > > I totally understand what Modules are meant to be and find them
> > > > fascinating. I have a couple of questions around their usage, please:
> > > >
> > > > 1) Is there any difference between a user using modules for reusing
> > > > static DAGs vs expansion of Modules at runtime?
> > > > 2) After Modules are introduced, will APEX-3's objective be to have
> > > > dynamic flattening of Modules at runtime?
> > > >
> > > >
> > > APEX-3 simply enables the launch time ability, i.e. static DAG. The
> > > objective is to get dynamic DAG changes at module level. Currently they
> > > are at operator level. There is no jira for dynamic module DAG changes
> > > afaik. But we did talk about it before Apex incubated.
> >
> > I think my attack plans have been around dynamic DAG changes, since that
> > is what I understood from our earlier offline discussion. I shall continue
> > working with Vlad on APEX-3 and shall open a new JIRA for dynamic changes
> > if that is ok.
> > >
> > >
> > > > >
> > > > > On Thu, Sep 3, 2015 at 1:12 AM, Atri Sharma <atri@apache.org> wrote:
> > > > >
> > > > > > Amol,
> > > > > >
> > > > > > For my understanding, when you mention launch time/code generation
> > > > > > time, are you referring to generation of the physical plan, please?
> > > > > >
> > > > > > Regards,
> > > > > >
> > > > > > Atri
> > > > > >
> > > > > > On Thu, Sep 3, 2015 at 12:48 PM, Amol Kekre <amol@datatorrent.com> wrote:
> > > > > >
> > > > > > > Atri,
> > > > > > > For a lot of operations a module should be treated as a black box.
> > > > > > > It is just another reusable IP. The flattening should happen at
> > > > > > > launch time.
> > > > > > >
> > > > > > > If we think of Apex as a compiler, then all the compile time checks
> > > > > > > (port connectivity, matching types/schema, properties, attributes,
> > > > > > > ...) are as applicable to modules as to operators. At launch time
> > > > > > > (aka code generation time) the module gets flattened. The webservice
> > > > > > > should still enable access via module scope on a running app.
> > > > > > >
> > > > > > > Thks,
> > > > > > > Amol
> > > > > > >
> > > > >
> > > >
> >
>
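The launch-time flattening discussed in the quoted thread can be illustrated with a small, self-contained sketch. This is not the Apex API: it is a Python toy model, and every name in it (`Dag`, `Operator`, `Module`, `flatten`, `entry_node`, `exit_node`, `DedupAndCount`) is hypothetical. The only idea borrowed from the thread is that a module expands itself through a populateDAG-style call, modelled here as `populate_dag`. It also assumes, for simplicity, that a module's entry and exit points are leaf operators.

```python
from dataclasses import dataclass, field


@dataclass
class Operator:
    """Leaf-level unit of work (hypothetical stand-in, not an Apex class)."""
    kind: str


class Module:
    """Reusable sub-DAG. Subclasses name their entry/exit operators and
    fill in a DAG when asked, loosely mirroring a populateDAG-style hook."""
    entry_node = None  # name of the inner operator that receives input
    exit_node = None   # name of the inner operator that emits output

    def populate_dag(self, dag):
        raise NotImplementedError


@dataclass
class Dag:
    nodes: dict = field(default_factory=dict)    # name -> Operator or Module
    streams: list = field(default_factory=list)  # (source name, sink name)

    def add(self, name, node):
        self.nodes[name] = node
        return node

    def connect(self, source, sink):
        self.streams.append((source, sink))


def flatten(logical):
    """Expand every module so the result contains operators only.

    Assumption: entry_node/exit_node refer to leaf operators inside the
    module, so stream endpoints can be rewired by simple name prefixing.
    """
    flat = Dag()
    entry, exit_ = {}, {}
    for name, node in logical.nodes.items():
        if isinstance(node, Module):
            sub = Dag()
            node.populate_dag(sub)
            inner = flatten(sub)  # handles modules nested inside modules
            for inner_name, op in inner.nodes.items():
                flat.add(f"{name}.{inner_name}", op)
            for src, dst in inner.streams:
                flat.connect(f"{name}.{src}", f"{name}.{dst}")
            entry[name] = f"{name}.{node.entry_node}"
            exit_[name] = f"{name}.{node.exit_node}"
        else:
            flat.add(name, node)
            entry[name] = exit_[name] = name
    # Rewire streams that touched a module to its entry/exit operators.
    for src, dst in logical.streams:
        flat.connect(exit_[src], entry[dst])
    return flat


class DedupAndCount(Module):
    """Example module built from two previously defined leaf operators."""
    entry_node = "dedup"
    exit_node = "count"

    def populate_dag(self, dag):
        dag.add("dedup", Operator("Deduper"))
        dag.add("count", Operator("Counter"))
        dag.connect("dedup", "count")


# Logical plan: what the user writes, with the module as a black box.
logical = Dag()
logical.add("input", Operator("FileReader"))
logical.add("stats", DedupAndCount())
logical.add("output", Operator("ConsoleWriter"))
logical.connect("input", "stats")
logical.connect("stats", "output")

# Launch time: the module is flattened into plain operators.
physical = flatten(logical)
print(sorted(physical.nodes))
print(physical.streams)
```

Prefixing inner names with the module name keeps the module scope recoverable after flattening, which is one possible way to support the module-scoped access on a running app mentioned in the thread.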



-- 
Regards,

Atri
*l'apprenant*
