hive-dev mailing list archives

From "jiraposter@reviews.apache.org (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-2479) Log more Hadoop task counter values in the MapRedStats class.
Date Tue, 04 Oct 2011 17:39:35 GMT

    [ https://issues.apache.org/jira/browse/HIVE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13120311#comment-13120311 ]

jiraposter@reviews.apache.org commented on HIVE-2479:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2167/
-----------------------------------------------------------

(Updated 2011-10-04 17:38:18.076011)


Review request for hive, Ramkumar Vadali and Yongqiang He.


Changes
-------

I added everything from Task$Counter except CPU_MILLISECONDS, because that counter currently
receives special treatment in HadoopJobExecHelper.


Summary
-------

I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper
to collect them.

I got tired of writing the same code over and over again, so I changed the way MapRedStats
and HadoopJobExecHelper treat task counters.  MapRedStats now has an enum of all the task
counters we want to collect; it is a subset of the enum in Task$Counter, which cannot be used
directly because Task is package private.  MapRedStats also contains a map from the enum values
to the counter values, with entries only for counters that were set.  HadoopJobExecHelper loops
over the enum values and tries to get a value for each counter.  As long as the new getter and
setter methods are used, the functionality is the same.  In particular, the getter returns the
value of a counter if it was set, and -1 otherwise.
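
For readers without the diff open, the shape of the change described above can be sketched
roughly as follows.  This is only an illustration under assumptions: the class name
MapRedStatsSketch, the method names getCounterValue, setCounterValue and collect, and the
plain Map standing in for Hadoop's job counters are all hypothetical and are not taken from
the patch.  Only four of the counters are listed, to keep it short.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

// Sketch only: every identifier here is an illustrative assumption,
// not a name from the actual HIVE-2479 patch.
class MapRedStatsSketch {

  // A few of the counters from the issue; the real enum mirrors more of Task$Counter.
  enum TaskCounter {
    MAP_SPILL_CPU,
    MAP_SPILL_BYTES,
    REDUCE_SHUFFLE_BYTES,
    MAP_INPUT_BYTES
  }

  // Holds an entry only for counters that were actually set.
  private final Map<TaskCounter, Long> counters = new EnumMap<>(TaskCounter.class);

  void setCounterValue(TaskCounter counter, long value) {
    counters.put(counter, value);
  }

  // Returns the counter's value if it was set, otherwise -1.
  long getCounterValue(TaskCounter counter) {
    Long value = counters.get(counter);
    return value == null ? -1L : value;
  }

  // Stand-in for the collection loop: walk the enum and copy over whatever
  // the job reported. A Map keyed by counter name replaces Hadoop's Counters here.
  static MapRedStatsSketch collect(Map<String, Long> reported) {
    MapRedStatsSketch stats = new MapRedStatsSketch();
    for (TaskCounter counter : TaskCounter.values()) {
      Long value = reported.get(counter.name());
      if (value != null) {
        stats.setCounterValue(counter, value);
      }
    }
    return stats;
  }

  public static void main(String[] args) {
    Map<String, Long> reported = new HashMap<>();
    reported.put("MAP_INPUT_BYTES", 1024L);
    MapRedStatsSketch stats = collect(reported);
    System.out.println(stats.getCounterValue(TaskCounter.MAP_INPUT_BYTES));      // prints 1024
    System.out.println(stats.getCounterValue(TaskCounter.REDUCE_SHUFFLE_BYTES)); // prints -1
  }
}
```

With this layout, collecting another counter should only require a new enum constant; the
loop and the accessors stay unchanged.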


This addresses bug HIVE-2479.
    https://issues.apache.org/jira/browse/Hive-2479


Diffs (updated)
-----

  trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1178612 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1178612 

Diff: https://reviews.apache.org/r/2167/diff


Testing
-------

I ran some queries to verify the counters were being populated.

I also ran a few of the unit test queries to verify I hadn't broken anything.


Thanks,

Kevin


                
> Log more Hadoop task counter values in the MapRedStats class.
> -------------------------------------------------------------
>
>                 Key: HIVE-2479
>                 URL: https://issues.apache.org/jira/browse/HIVE-2479
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>
> We should log more of the Hadoop task tracker counters in the MapRedStats class, in order
> to make them available to hooks and improve logging.  Specifically these are the counters
> we should add:
>     MAP_SPILL_CPU,
>     MAP_SPILL_WALLCLOCK,
>     MAP_SPILL_NUMBER,
>     MAP_SPILL_BYTES,
>     MAP_MEM_SORT_CPU,
>     MAP_MEM_SORT_WALLCLOCK,
>     MAP_MERGE_CPU,
>     MAP_MERGE_WALLCLOCK,
>     REDUCE_SHUFFLE_BYTES,
>     REDUCE_COPY_WALLCLOCK,
>     REDUCE_COPY_CPU,
>     REDUCE_SORT_WALLCLOCK,
>     REDUCE_SORT_CPU,
>     MAP_TASK_WALLCLOCK,
>     REDUCE_TASK_WALLCLOCK,
>     MAP_INPUT_BYTES

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
