crunch-dev mailing list archives

From "Gabriel Reid (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CRUNCH-438) Visualizations of some important internal/intermediate pipeline planning states
Date Mon, 07 Jul 2014 20:24:35 GMT

    [ https://issues.apache.org/jira/browse/CRUNCH-438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14054166#comment-14054166 ]

Gabriel Reid commented on CRUNCH-438:
-------------------------------------

{quote}For the experiment I've also integrated this with the PlanningParameters.PIPELINE_DOTFILE_OUTPUT_DIR
(CRUNCH-418). If the PIPELINE_DOTFILE_OUTPUT_DIR path is set then 5 dotfiles will be produced.

I agree with Gabriel Reid that those diagrams are more like a debug tool. If the PIPELINE_DOTFILE_OUTPUT_DIR
is not for debugging purposes, then perhaps I should revert this integration?{quote}

The way I see it, the PIPELINE_DOTFILE_OUTPUT_DIR and {{PlanningParameters.PIPELINE_PLAN_DOTFILE}}
are for helping Crunch users understand what their pipelines are doing and for pinpointing
performance issues, etc. (at least that's how I use them). I guess you could call them debugging
tools, but they're more for people using Crunch as a library. I think these new dotfiles are
more for understanding the inner workings of the planner, which is why I think it's better
not to just dump them in the PIPELINE_DOTFILE_OUTPUT_DIR. Just my opinion of course.

Another nitpick on something minor: am I correct in assuming that BASE_GRAPH_PLANE_DOTFILE
should be BASE_GRAPH_PLAN_DOTFILE (i.e. PLAN vs PLANE)?




> Visualizations of some important internal/intermediate pipeline planning states
> -------------------------------------------------------------------------------
>
>                 Key: CRUNCH-438
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-438
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.10.0, 0.8.3
>            Reporter: Christian Tzolov
>            Assignee: Christian Tzolov
>         Attachments: CRUNCH-438.2.patch, CRUNCH-438.patch
>
>
> To improve the understandability of the pipeline planning stages it would help to visualize
> some intermediate planning states like:
> - PCollection lineage. (visualizing the output-pcollection-targets structure) 
> - MSCRPlanner's planning Graphs before and after the split up of dependent GBK nodes
> - RTNode hierarchy along with the Input and Output configurations as persisted in the
> Configuration before the execution of the pipeline.
> Most of the information can be intercepted in the MSCRPlanner#plan() method.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
