Date: Tue, 27 May 2014 19:41:01 +0000 (UTC)
From: "Micah Whitacre (JIRA)"
To: crunch-dev@incubator.apache.org
Reply-To: dev@crunch.apache.org
Subject: [jira] [Commented] (CRUNCH-272) Unable to correlate crunch jobs within Oozie

    [ https://issues.apache.org/jira/browse/CRUNCH-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010181#comment-14010181 ]

Micah Whitacre commented on CRUNCH-272:
---------------------------------------

Unfortunately, my approach is not a complete solution. Specifically, I missed this line [1] of code, embedded inside the launcher action, that actually ties the properties back into the action and subsequently has the values stored in Oozie.
This means that we will need a custom Oozie launching action/code, which isn't horrible, but I'm not sure we have a set structure that would let us create a schema for launching Crunch pipelines.

[1] - https://github.com/cloudera/oozie/blob/a659fd0f2e56850a35e38a6174667b0c07a75b57/core/src/main/java/org/apache/oozie/action/hadoop/HiveActionExecutor.java#L123

> Unable to correlate crunch jobs within Oozie
> --------------------------------------------
>
>                 Key: CRUNCH-272
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-272
>             Project: Crunch
>          Issue Type: Improvement
>            Reporter: Mike Zimmerman
>            Assignee: Micah Whitacre
>         Attachments: CRUNCH-272_prototype.patch
>
>
> I'm not really sure if this should be logged to Oozie or to Crunch, so please feel free to move as needed.
> I would like to request a way to decorate map/reduce jobs that are spawned by a Crunch pipeline so that I can programmatically determine their origin. The primary use case for this is integration with Oozie. Oozie launches a single map job to run a Java action (in our case this Java action runs a Crunch job). Traceability from this original "launcher" job to the jobs created by the Crunch job is impossible without trolling logs. This leaves a big black hole for the system operator to assess the performance/impact of these jobs. My initial thought was to provide a simple way to indicate a correlationId or similar on a map/reduce job and then make it accessible within Oozie to query for. Obviously, that request would have to come after the correlation feature was available within map/reduce.

--
This message was sent by Atlassian JIRA
(v6.2#6252)