apex-dev mailing list archives

From Amol Kekre <a...@datatorrent.com>
Subject Re: Proposing a new feature to persist logical and physical plan snapshots in HDFS
Date Thu, 24 Nov 2016 01:17:13 GMT
Persisting the plan on DFS is good; I am +1 for it. It could support both of
the following:

1. Attribute: if set, persist the plan to DFS whenever the plan changes
2. On demand

Thks
Amol


On Wed, Nov 23, 2016 at 4:15 PM, Sanjay Pujare <sanjay@datatorrent.com>
wrote:

> Okay, but this “state” is gone after the app is “dead”, isn’t that true?
> Also, the reason for this enhancement is debuggability/troubleshooting of
> Apex apps, so it is good to have separate, explicit, user-visible files that
> contain the plan information instead of overloading the state for this
> purpose (in my opinion).
>
> In terms of on-demand, it sounds like a good idea; I didn’t think of it.
> But I would like to drill down into the use cases. In most cases,
> logical/physical plan changes are spontaneous, or rather internal to the
> app, so an external entity making a REST call to save the plan on demand
> might not sync up with when the plan changes took place inside the app.
> Saving the plan JSON files on the events described previously therefore
> seems to be the most efficient thing to do (as discussed with @Ashwin
> Putta), but if there are use cases, I think it is a good idea to do it on
> demand as well.
>
> On 11/23/16, 3:00 PM, "Amol Kekre" <amol@datatorrent.com> wrote:
>
>     Good idea. Stram does save state, and maybe a script that translates it
>     may work. But explicit plan saving is also a good idea. Could this be "on
>     demand"? A REST call that writes out the plan(s) to specified HDFS files?
>
>     We could do both (write on any change/set call) and/or on-demand.
>
>     Thks
>     Amol
>
>
>     On Wed, Nov 23, 2016 at 2:40 PM, Sanjay Pujare <sanjay@datatorrent.com>
>     wrote:
>
>     > To help Apex developers/users with debugging or troubleshooting “dead”
>     > applications, I am proposing a new feature to persist logical and
>     > physical plan snapshots in HDFS.
>     >
>     > Similar to how the Apex engine persists container data per application
>     > attempt in HDFS as containers_NNN.json (where NNN is 1 for the first app
>     > attempt, 2 for the second app attempt and so on), we will create 2 more
>     > sets of files under the …/apps/{appId} directory for an application:
>     >
>     > logicalPlan_NNN_MMM.json
>     > physicalPlan_NNN_MMM.json
>     >
>     > where NNN stands for the app attempt index (as above: 1, 2, 3 and so
>     > on) and MMM is a running index, starting at 1, that identifies a
>     > snapshot within an app attempt. Note that a logical or physical plan may
>     > change within an app attempt for any number of reasons.
>     >
>     > The StreamingContainerManager class maintains the current
>     > logical/physical plans in the “plan” member variable. New methods will
>     > be added in StreamingContainerManager to save the logical or physical
>     > plan as JSON representations in the app directory (as described above).
>     > The logic is similar to
>     > com.datatorrent.stram.webapp.StramWebServices.getLogicalPlan(String)
>     > and com.datatorrent.stram.webapp.StramWebServices.getPhysicalPlan(),
>     > used inside the Stram Web service. There will be running indexes in
>     > StreamingContainerManager to keep track of MMM for the logical plan and
>     > physical plan. The appropriate save method will be called on the
>     > occurrence of any event that updates the logical or physical plan, for
>     > example:
>     >
>     > inside com.datatorrent.stram.StreamingContainerManager.LogicalPlanChangeRunnable.call()
>     > for a logical plan change event
>     >
>     > inside com.datatorrent.stram.plan.physical.PhysicalPlan.redoPartitions(PMapping, String)
>     > for a physical plan change event (i.e. redoing partitioning)
>     >
>     > Once these files are created, any user or tool (such as the Apex CLI or
>     > the DT Gateway) can look up these files to troubleshoot or research
>     > “dead” applications and significant events in their lifetime in terms of
>     > logical or physical plan changes. Please send me your feedback.
>     >
>     > Sanjay
>
>
>
>
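
The naming scheme proposed in the quoted message (logicalPlan_NNN_MMM.json and physicalPlan_NNN_MMM.json, with one running index per plan type) might look roughly like the sketch below. It is only an illustration under stated assumptions: PlanSnapshotWriter and its methods are hypothetical names and not part of Apex, the plan JSON is passed in as a ready-made string, and java.nio.file on a local directory stands in for the HDFS app directory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not an Apex class): names and writes plan snapshots.
class PlanSnapshotWriter {
    private final Path appDir;     // assumption: local stand-in for the app directory
    private final int attempt;     // NNN: app attempt index (1, 2, 3, ...)
    private int logicalIndex = 0;  // MMM for logical plan snapshots
    private int physicalIndex = 0; // MMM for physical plan snapshots

    PlanSnapshotWriter(Path appDir, int attempt) {
        this.appDir = appDir;
        this.attempt = attempt;
    }

    // Builds e.g. "logicalPlan_2_1.json" from the plan kind, NNN and MMM.
    static String fileName(String kind, int attempt, int snapshot) {
        return kind + "_" + attempt + "_" + snapshot + ".json";
    }

    // Would be called on a logical plan change event (or on demand).
    synchronized Path saveLogicalPlan(String json) throws IOException {
        return write(fileName("logicalPlan", attempt, ++logicalIndex), json);
    }

    // Would be called on a physical plan change event (or on demand).
    synchronized Path savePhysicalPlan(String json) throws IOException {
        return write(fileName("physicalPlan", attempt, ++physicalIndex), json);
    }

    private Path write(String name, String json) throws IOException {
        Files.createDirectories(appDir);
        return Files.write(appDir.resolve(name), json.getBytes(StandardCharsets.UTF_8));
    }
}
```

Each call produces a new file rather than overwriting the previous snapshot, so the history of plan changes within an attempt stays available after the application is dead. The same save methods could serve both the attribute-driven and the on-demand path discussed in the thread.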
