apex-dev mailing list archives

From Atri Sharma <atri.j...@gmail.com>
Subject Re: Modules support in Apex
Date Thu, 03 Sep 2015 18:55:24 GMT
Thanks for the detailed explanation!

I have a few small questions. Please find them inline.

On 3 Sep 2015 21:03, "Amol Kekre" <amol@datatorrent.com> wrote:

> Design time -> design, try, iterate => Re-use of IP saves a lot of time
> here. Operators help as leaf level IP, Modules help as higher level IP
> that can be made by combining previously tested leaf level IP.
> Additionally they should be (a) correct by construction due to leveraging
> other operators (b) allow properties, attributes to make re-use IP log
> more powerful. Modules do not prohibit Apex engine from optimizing DAG
> during launch (or even run) time, all the while enabling re-use at higher
> level. In fact they may aid optimization by giving specific hints at a
> sub-DAG level. Same as functions, templates, etc. in C++. This phase is
> mainly human time. This results in a "logical plan" of App DAG
>

So essentially, the logical plan is the one manually defined by users
from operators and modules, correct?

(I understand the logical plan is generated from JSON files; I just want to
clarify whether those JSON files are the ones the user writes to define the
DAG.)

> Launch time ->  Usually a one time cost. This is always completely
> automated in any compiler (aka compile time). So stuffing more here moves
> us from human time to computer time. Module moves a lot more work from
> human time to computer time as compared to leaf level operator. This
> results in physical plan of App DAG

If I understand correctly, this is where we expand Modules, i.e. call the
Module's populateDAG method, correct?
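
To check my understanding, here is a rough sketch of how I picture
launch-time flattening. Every type below (Node, Operator, Module,
populateDag, flatten) is made up for illustration and is not the actual
Apex API; streams and ports are left out entirely.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: none of these types are the real Apex API.
class FlattenSketch {

    // A node in the logical plan: either a leaf operator or a module.
    interface Node { String name(); }

    // Leaf-level IP: nothing to expand.
    static final class Operator implements Node {
        private final String name;
        Operator(String name) { this.name = name; }
        public String name() { return name; }
    }

    // Higher-level IP: a named sub-DAG built from operators and other modules.
    static final class Module implements Node {
        private final String name;
        private final List<Node> children = new ArrayList<>();
        Module(String name) { this.name = name; }
        public String name() { return name; }
        Module add(Node child) { children.add(child); return this; }
        // Assumed stand-in for populateDAG: returns the members of the sub-DAG.
        List<Node> populateDag() { return children; }
    }

    // Launch-time step: expand modules recursively until only operators remain.
    // Each operator name is prefixed with its enclosing module names, so the
    // module scope can still be recovered from the flattened plan.
    static List<String> flatten(List<Node> logicalPlan, String scope) {
        List<String> flat = new ArrayList<>();
        for (Node node : logicalPlan) {
            String qualified = scope.isEmpty() ? node.name() : scope + "." + node.name();
            if (node instanceof Module) {
                flat.addAll(flatten(((Module) node).populateDag(), qualified));
            } else {
                flat.add(qualified);
            }
        }
        return flat;
    }

    public static void main(String[] args) {
        Module parser = new Module("parse")
                .add(new Operator("split"))
                .add(new Operator("validate"));
        List<Node> logicalPlan = new ArrayList<>();
        logicalPlan.add(new Operator("input"));
        logicalPlan.add(parser);
        logicalPlan.add(new Operator("output"));
        // Prints [input, parse.split, parse.validate, output]
        System.out.println(flatten(logicalPlan, ""));
    }
}
```

In this picture the logical plan still contains the "parse" module, and
only the flattened list of operators would go on to physical planning.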

> Run time  -> Actual app run time. At this stage the flattening has already
> happened. Apex apps are to run for-ever or for a long time. These are not
> 1 min apps. So an extra 10 seconds during launch time in a big data project
> is amortized away over a days, months, or at least a few hours.

Got that, thanks!

> Operators do that at leaf level, modules enable true distributed
> execution.

Is the distributed nature due to reuse?

> I am hoping to see this thesis proved by making Calcite integration
> easier.

I fully understand what Modules are meant to be and find them fascinating.
I have a couple of questions about their usage:

1) Is there any difference between a user using modules to reuse static
DAGs and the expansion of Modules at runtime?
2) After Modules are introduced, will APEX-3's objective be to have dynamic
flattening of Modules at runtime?

>
> On Thu, Sep 3, 2015 at 1:12 AM, Atri Sharma <atri@apache.org> wrote:
>
> > Amol.
> >
> > For my understanding, when you mention launch time/code generation time,
> > are you referring to generation of physical plan, please?
> >
> > Regards,
> >
> > Atri
> >
> > On Thu, Sep 3, 2015 at 12:48 PM, Amol Kekre <amol@datatorrent.com> wrote:
> >
> > > Atri,
> > > For a lot of operations module should be treated as a black box. It is
> > > just another reusable IP. The flattening should happen at launch time.
> > >
> > > If we think of Apex as a compiler, then all the compile time checks
> > > (ports connectivity, matching types/schema, properties, attributes, ...)
> > > are as applicable to modules as to operators. At launch time (aka code
> > > generation time) module gets flattened. Webservice should still enable
> > > access via module scope on a running app.
> > >
> > > Thks,
> > > Amol
> > >
>
