hadoop-pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-162) Rework mapreduce submission and monitoring
Date Fri, 09 May 2008 19:00:58 GMT

    [ https://issues.apache.org/jira/browse/PIG-162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595690#action_12595690 ]

Alan Gates commented on PIG-162:

I took a look at mapreduceJumboWithComInc.patch.  In general this looks good.  A couple of comments:

1) I'm still unclear on why we now need a separate map class for map-only jobs.  I see that
for map-only jobs you omit the key and collect just the tuple, which makes sense.  But if we
need to do this, how is it that Arun's patch 196 works?  (That's really a question for Arun
more than for you.)  You say it has something to do with types.  Based on a brief glance, the
current code appears to use the whole tuple as both key and value, while the new code uses
the first field of the tuple as the key.  Is that what causes the issue?  If so, what are the
advantages of using the first field of the tuple instead of the whole tuple as the key for
sort and shuffle?
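
To make the two keying schemes in this question concrete, here is a minimal sketch in plain Java. It is not taken from the patch: the class and method names are hypothetical, and a `List<Object>` stands in for a Pig tuple. It only shows what each scheme would hand to sort and shuffle, and that a map-only job has no key to emit.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from the patch.
public class KeyChoiceSketch {

    // Scheme A (assumed reading of the current code): the whole tuple is
    // used as both the key and the value.
    public static Map.Entry<Object, List<Object>> keyByWholeTuple(List<Object> tuple) {
        return new SimpleEntry<Object, List<Object>>(tuple, tuple);
    }

    // Scheme B (assumed reading of the new code): only the first field of
    // the tuple is used as the key; the tuple itself is the value.
    public static Map.Entry<Object, List<Object>> keyByFirstField(List<Object> tuple) {
        return new SimpleEntry<Object, List<Object>>(tuple.get(0), tuple);
    }

    // Map-only job: there is no sort or shuffle, so no key is produced and
    // the tuple is collected on its own.
    public static List<Object> mapOnlyOutput(List<Object> tuple) {
        return tuple;
    }

    public static void main(String[] args) {
        List<Object> tuple = Arrays.<Object>asList("alice", 42, 3.5);
        System.out.println("whole-tuple key: " + keyByWholeTuple(tuple).getKey());
        System.out.println("first-field key: " + keyByFirstField(tuple).getKey());
        System.out.println("map-only output: " + mapOnlyOutput(tuple));
    }
}
```

Under scheme B the key and the value have different types, which may be related to the type issue mentioned above; that connection is a guess, not something the patch confirms.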

2) Please make your JUnit tests conform to the older (3.x) style of JUnit.  Some users are
still on the older version, and we haven't explicitly required JUnit 4.x.  This will mean
renaming all of the methods in TestMRCompiler that start with "test" but aren't tests, since
the older style picks up tests by that name prefix.
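
The naming concern can be shown with a short reflection sketch. This is a simplified imitation of the JUnit 3.x convention (public, void, no-argument methods whose names start with "test" are run as tests), not JUnit's own code. The example class and its method names are invented for illustration and are not from TestMRCompiler.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Simplified imitation of name-based test discovery; not JUnit's real code.
public class Junit3NamingSketch {

    // Hypothetical test class with invented method names.
    public static class ExampleCompilerTest {
        public void testSimplePlan() { }   // a real test
        public void testPlanHelper() { }   // a helper, but its name starts with "test"
        public void buildPlanHelper() { }  // a helper with a safe name
    }

    // Returns the sorted names of methods a name-based runner would treat as tests.
    public static List<String> discoveredTests(Class<?> cls) {
        List<String> names = new ArrayList<String>();
        for (Method m : cls.getDeclaredMethods()) {
            boolean looksLikeTest = Modifier.isPublic(m.getModifiers())
                    && m.getReturnType() == void.class
                    && m.getParameterTypes().length == 0
                    && m.getName().startsWith("test");
            if (looksLikeTest) {
                names.add(m.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // The helper named testPlanHelper is listed alongside the real test.
        System.out.println(discoveredTests(ExampleCompilerTest.class));
    }
}
```

Renaming such helpers so they no longer begin with "test" keeps a name-based runner from executing them as tests.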

As a general note, we all agree that the separate reporter thread is an issue, and that
you'll submit a fix for it in a subsequent patch.

I'm willing to submit this patch as is, except that the TestMRCompiler test fails in the unit
tests.  I'll attach the output of that test run separately.  From what I could tell, the failure
is a real issue: the plan being generated did not appear to match the expected plan.

> Rework mapreduce submission and monitoring
> ------------------------------------------
>                 Key: PIG-162
>                 URL: https://issues.apache.org/jira/browse/PIG-162
>             Project: Pig
>          Issue Type: Sub-task
>         Environment: This bug tracks work to rework the submission and monitoring interface
to map reduce as described in http://wiki.apache.org/pig/PigTypesFunctionalSpec
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>         Attachments: mapreduceJumbo.patch, mapreduceJumboWithComInc.patch, split.png,
TEST-org.apache.pig.test.TestMRCompiler.txt, TEST-org.apache.pig.test.TestUnion.txt

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
