pig-user mailing list archives

From Yongzhi Wang <wang.yongzhi2...@gmail.com>
Subject Re: View Map-Reduce payload
Date Tue, 06 Mar 2012 18:00:58 GMT
Hi,

Sorry to bother you.

I tried the "explain" command, but the MapReduce plan it displays
still confuses me at times.

I tried the script below:

my_raw = LOAD './houred-small' USING PigStorage('\t') AS (user, hour, query);
part1 = filter my_raw by hour>11;
part2 = filter my_raw by hour<13;
result = cogroup part1 by hour, part2 by hour;
dump result;
explain result;

The job stats are shown below; they indicate 2 map tasks and 1 reduce
task. But I don't know how the map tasks correspond to the MapReduce
plan shown below. It seems each map task just does one filter and one
rearrange, but in which phase is the union performed? The shuffle phase?
In that case, the two map tasks would actually do different filter work.
Is that possible? Or is my guess wrong?

So, back to the question: *Is there any way to see the actual map
and reduce tasks that Pig executes?*

Job Stats (time in seconds):
JobId:          job_201203021230_0038
Maps:           2
Reduces:        1
MaxMapTime:     3
MinMapTIme:     3
AvgMapTime:     3
MaxReduceTime:  12
MinReduceTime:  12
AvgReduceTime:  12
Alias:          my_raw,part1,part2,result
Feature:        COGROUP
Outputs:        hdfs://master:54310/tmp/temp626037557/tmp-1661404166,

The MapReduce plan is shown below:
#--------------------------------------------------
# Map Reduce Plan
#--------------------------------------------------
MapReduce node scope-84
Map Plan
Union[tuple] - scope-85
|
|---result: Local Rearrange[tuple]{bytearray}(false) - scope-73
|   |   |
|   |   Project[bytearray][1] - scope-74
|   |
|   |---part1: Filter[bag] - scope-59
|       |   |
|       |   Greater Than[boolean] - scope-63
|       |   |
|       |   |---Cast[int] - scope-61
|       |   |   |
|       |   |   |---Project[bytearray][1] - scope-60
|       |   |
|       |   |---Constant(11) - scope-62
|       |
|       |---my_raw: New For Each(false,false,false)[bag] - scope-89
|           |   |
|           |   Project[bytearray][0] - scope-86
|           |   |
|           |   Project[bytearray][1] - scope-87
|           |   |
|           |   Project[bytearray][2] - scope-88
|           |
|           |---my_raw: Load(hdfs://master:54310/user/root/houred-small:PigStorage('    ')) - scope-90
|
|---result: Local Rearrange[tuple]{bytearray}(false) - scope-75
    |   |
    |   Project[bytearray][1] - scope-76
    |
    |---part2: Filter[bag] - scope-66
        |   |
        |   Less Than[boolean] - scope-70
        |   |
        |   |---Cast[int] - scope-68
        |   |   |
        |   |   |---Project[bytearray][1] - scope-67
        |   |
        |   |---Constant(13) - scope-69
        |
        |---my_raw: New For Each(false,false,false)[bag] - scope-94
            |   |
            |   Project[bytearray][0] - scope-91
            |   |
            |   Project[bytearray][1] - scope-92
            |   |
            |   Project[bytearray][2] - scope-93
            |
            |---my_raw: Load(hdfs://master:54310/user/root/houred-small:PigStorage('    ')) - scope-95--------
Reduce Plan
result: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-77
|
|---result: Package[tuple]{bytearray} - scope-72--------
Global sort: false
----------------
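
To make the question concrete, here is one possible reading of the plan
above, written as a small Python sketch. It is only a guess at the
behaviour, not code that Pig generates. It assumes that each map task
reads for a single Load and so runs only one Filter / Local Rearrange
branch, that the Union simply sends both branches' output into the same
shuffle, and that Package on the reduce side builds one bag per input for
each key. The sample rows and all function names are made up for
illustration.

```python
from collections import defaultdict

# Hypothetical sample rows (user, hour, query); not taken from the real input file.
ROWS = [("u1", 10, "a"), ("u2", 12, "b"), ("u3", 14, "c")]


def map_branch(rows, index, keep):
    """One branch of the Map Plan: Filter, then Local Rearrange.

    Emits (key, branch index, tuple) so the reduce side can tell the
    two inputs apart. The key is the hour column, as in the script.
    """
    return [(row[1], index, row) for row in rows if keep(row)]


def shuffle(emitted):
    """Stand-in for the shuffle: collect all map output by key."""
    grouped = defaultdict(list)
    for key, index, row in emitted:
        grouped[key].append((index, row))
    return grouped


def package(grouped, n_inputs=2):
    """Stand-in for Package in the Reduce Plan: one bag per input, per key."""
    result = []
    for key in sorted(grouped):
        bags = [[] for _ in range(n_inputs)]
        for index, row in grouped[key]:
            bags[index].append(row)
        result.append((key, *bags))
    return result


def cogroup_sketch(rows):
    # Assumption: each map task serves one Load, so it runs only one branch.
    task0 = map_branch(rows, 0, lambda r: r[1] > 11)  # part1: hour > 11
    task1 = map_branch(rows, 1, lambda r: r[1] < 13)  # part2: hour < 13
    # Union: the output of both branches goes to the same shuffle.
    return package(shuffle(task0 + task1))


if __name__ == "__main__":
    for record in cogroup_sketch(ROWS):
        print(record)
```

If this reading is right, the two map tasks in the job stats would each
do a different filter, and the union would not be a separate phase at all.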

Thanks!


On Tue, Mar 6, 2012 at 11:21 AM, Aniket Mokashi <aniket486@gmail.com> wrote:

> http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#EXPLAIN
>
> On Tue, Mar 6, 2012 at 5:28 AM, shan shan <mysub987@gmail.com> wrote:
>
> > Hi
> > Can  I see the user-payload for the MapReduce job that is created by Pig.
> > How?
> > i.e. the Map and Reduce function code that is generated by Pig script..
> >
> > Thanks,
> >
>
>
>
> --
> "...:::Aniket:::... Quetzalco@tl"
>
