hadoop-pig-dev mailing list archives

From "Dmitriy V. Ryaboy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-908) Need a way to correlate MR jobs with Pig statements
Date Tue, 04 Aug 2009 20:00:14 GMT

    [ https://issues.apache.org/jira/browse/PIG-908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739125#action_12739125 ]

Dmitriy V. Ryaboy commented on PIG-908:
---------------------------------------

An idea for something that might work (I haven't evaluated the complexity of implementing this):

When LogicalOperators are created, a bit of metadata is attached to them, listing the script
line numbers that they come from.  Multiple LOs may be created from a single line, and multiple
lines may be associated with a single operator.

This metadata is passed down to Physical Operators.

When an MR job is created, a log message is written listing the line numbers that are associated
with the POs in this map-reduce job, and the job name.
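
A rough sketch of the flow described above, written in Python rather than Pig's Java to keep it short. Every class and function name here is hypothetical and stands in for the real thing; none of them are actual Pig APIs, and the log message format is only an assumption.

```python
from dataclasses import dataclass


@dataclass
class LogicalOp:
    """Hypothetical stand-in for a Pig LogicalOperator."""
    name: str
    # Script lines this operator came from; one operator may span several lines.
    source_lines: frozenset = frozenset()


@dataclass
class PhysicalOp:
    """Hypothetical stand-in for a Pig PhysicalOperator."""
    name: str
    source_lines: frozenset = frozenset()


def to_physical(lo):
    # Pass the line-number metadata down to the physical operator unchanged.
    return PhysicalOp(name="PO" + lo.name, source_lines=lo.source_lines)


def describe_job(job_name, physical_ops):
    # Merge the lines behind every PO in the job into a single log message.
    lines = sorted(set().union(*(po.source_lines for po in physical_ops)))
    return f"MR job {job_name!r} covers script lines {lines}"


# Two operators: one from line 1, one spanning lines 2 and 3.
los = [LogicalOp("Load", frozenset({1})), LogicalOp("Filter", frozenset({2, 3}))]
pos = [to_physical(lo) for lo in los]
print(describe_job("job_1", pos))
```

Keeping the line numbers as a set on each operator covers both cases mentioned above: several operators can share a line, and one operator can carry several lines.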

Thoughts?

> Need a way to correlate MR jobs with Pig statements
> ---------------------------------------------------
>
>                 Key: PIG-908
>                 URL: https://issues.apache.org/jira/browse/PIG-908
>             Project: Pig
>          Issue Type: Wish
>            Reporter: Dmitriy V. Ryaboy
>
> Complex Pig scripts often generate many Map-Reduce jobs, especially with the recent introduction
> of multi-store capabilities.
> For example, the first script in the Pig tutorial produces 5 MR jobs.
> There is currently very little support for debugging the resulting jobs; if one of the MR
> jobs fails, it is hard to figure out which part of the script was responsible for it. Explain
> plans help, but even with an explain plan, a fair amount of effort (and sometimes experimentation)
> is required to correlate the failing MR job with the corresponding Pig Latin statements.
> This ticket is created to discuss approaches to alleviating this problem.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

