crunch-user mailing list archives

From Chao Shi <stepi...@live.com>
Subject Re: Visualize DAG of a pipeline
Date Wed, 06 Feb 2013 03:55:29 GMT
Thanks, guys. It works.

I think it would be nice to have a web console. It could display some
interactive things, such as which stage is running, and clicking on the
running stage would navigate to its JobTracker (JT) page.

An API to start a web console is simpler to define than one exposing the
full state. Including Jetty may make the jar too large, so this could live
in a separate jar: only users who use the web console would need to load it.

On Tue, Feb 5, 2013 at 6:06 PM, Gabriel Reid <gabriel.reid@gmail.com> wrote:

> By the way, when this feature was originally implemented there was some
> discussion about the best way to expose it via the API, and exposing it via
> the Configuration was considered (probably) only temporary.
>
> If there are suggestions about the best way to expose this functionality
> via the API, please let me know.
>
> - Gabriel
>
>
> On Tue, Feb 5, 2013 at 4:32 AM, Rahul Sharma <rahul0208@gmail.com> wrote:
>
>> Yes, a DOT language file is generated in the pipeline. The file is a
>> visualization of how the MR jobs have been executed in the pipeline. You
>> can access it like this:
>>
>>
>> String dotFileContents = pipeline.getConfiguration().get(PlanningParameters.PIPELINE_PLAN_DOTFILE);
>>
>> The file can be analyzed with various tools like Graphviz. For more on
>> DOT please check http://en.wikipedia.org/wiki/DOT_language
>>
>>
>> On Tue, Feb 5, 2013 at 8:49 AM, Josh Wills <jwills@cloudera.com> wrote:
>>
>>> +greid
>>>
>>> Gabriel wrote one, IIRC-- I think that a .dot file with the plan for the
>>> job gets embedded in the Configuration object returned from the planner.
>>>
>>>
>>> On Mon, Feb 4, 2013 at 7:13 PM, Chao Shi <stepinto@live.com> wrote:
>>>
>>>> Hi crunch users,
>>>>
>>>> I would like to know if there is any tool to help me understand Crunch's
>>>> optimized MR stages.
>>>>
>>>> Particularly, I think I need to see the DAG of job stages. I'm writing
>>>> a pipeline that consists of several joins. The pipeline produces
>>>> significantly more intermediate output than I expect. I want to
>>>> investigate what's going wrong there.
>>>>
>>>> Thanks,
>>>> Chao
>>>>
>>>
>>>
>>>
>>> --
>>> Director of Data Science
>>> Cloudera <http://www.cloudera.com>
>>> Twitter: @josh_wills <http://twitter.com/josh_wills>
>>>
>>
>>
>
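
For anyone finding this thread later, the approach Rahul describes can be sketched roughly as follows. This is only a minimal illustration, not part of Crunch: the class name PlanDotFileWriter and the method writeDotFile are hypothetical, and the null check assumes the property is simply absent until the pipeline has been planned. The helper takes the DOT text as a plain String so it runs without Crunch on the classpath; in a real job the string would come from the pipeline.getConfiguration().get(PlanningParameters.PIPELINE_PLAN_DOTFILE) call quoted above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class PlanDotFileWriter {

    /**
     * Writes DOT text to the given file and returns the path.
     * In a Crunch job the text would come from the call quoted in this thread:
     *   pipeline.getConfiguration().get(PlanningParameters.PIPELINE_PLAN_DOTFILE)
     */
    public static Path writeDotFile(String dotContents, Path target) throws IOException {
        if (dotContents == null) {
            // Assumption: the property is unset until the pipeline has been planned.
            throw new IllegalStateException("No plan dotfile found in the configuration");
        }
        Files.write(target, dotContents.getBytes(StandardCharsets.UTF_8));
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in DOT text; a real run would pass the string read from the pipeline configuration.
        String sample = "digraph G {\n  \"input\" -> \"join\";\n}\n";
        Path out = writeDotFile(sample, Files.createTempFile("pipeline-plan", ".dot"));
        System.out.println("Wrote " + out);
    }
}
```

The resulting file can then be rendered with Graphviz, for example with dot -Tpng pipeline-plan.dot -o pipeline-plan.png.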
