From: Rahul Sharma
Date: Tue, 5 Feb 2013 09:02:36 +0530
Subject: Re: Visualize DAG of a pipeline
To: crunch-user@incubator.apache.org

Yes, a DOT language file is generated in the pipeline. The file is a
visualization of how the MR jobs are executed in the pipeline. You can
access it like this:

    String dotFileContents = pipeline.getConfiguration().get(PlanningParameters.PIPELINE_PLAN_DOTFILE);

The file can be analyzed with tools such as Graphviz. For more on DOT,
please check http://en.wikipedia.org/wiki/DOT_language

On Tue, Feb 5, 2013 at 8:49 AM, Josh Wills wrote:

> +greid
>
> Gabriel wrote one, IIRC. I think that a .dot file with the plan for the
> job gets embedded in the Configuration object returned from the planner.
>
>
> On Mon, Feb 4, 2013 at 7:13 PM, Chao Shi wrote:
>
>> Hi crunch users,
>>
>> I would like to know if there is any tool to help me understand the
>> MR stages that Crunch produces after optimization.
>>
>> In particular, I think I need to see the DAG of job stages. I'm writing
>> a pipeline that consists of several joins. The pipeline produces
>> significantly more intermediate output than I expect. I want to
>> investigate what's going wrong there.
>>
>> Thanks,
>> Chao
>>
>
> --
> Director of Data Science
> Cloudera
> Twitter: @josh_wills
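The lookup quoted in the reply returns the plan as a plain string, so it has to be saved to a file before Graphviz can read it. The following is a minimal sketch of that step, not code from Crunch: the class `PlanDotWriter`, the method `writePlan`, and the file name `plan.dot` are hypothetical names chosen for illustration, and the sample DOT string stands in for the value that the `pipeline.getConfiguration().get(...)` call would return. It also assumes the value may be missing if the pipeline has not been planned yet.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Crunch: saves a DOT plan string to disk.
class PlanDotWriter {

    // Writes the DOT text to `target` and returns the path that was written.
    static Path writePlan(String dotFileContents, Path target) throws IOException {
        if (dotFileContents == null || dotFileContents.isEmpty()) {
            // Assumption: the value may be absent until the pipeline has been planned.
            throw new IllegalArgumentException("no DOT plan available");
        }
        Files.write(target, dotFileContents.getBytes(StandardCharsets.UTF_8));
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the real value, which would come from:
        //   pipeline.getConfiguration().get(PlanningParameters.PIPELINE_PLAN_DOTFILE)
        String dotFileContents = "digraph G { \"input\" -> \"join\" -> \"output\"; }";
        Path written = writePlan(dotFileContents, Paths.get("plan.dot"));
        System.out.println("Wrote " + written.toAbsolutePath());
    }
}
```

With Graphviz installed, the saved file can then be rendered with `dot -Tpng plan.dot -o plan.png`, which should make unexpected extra stages between the joins easier to spot.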