hive-dev mailing list archives

From "Gunther Hagleitner (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-4318) OperatorHooks hit performance even when not used
Date Thu, 11 Apr 2013 22:15:19 GMT

    [ https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629474#comment-13629474 ]

Gunther Hagleitner commented on HIVE-4318:
------------------------------------------

[~kevinwilfong]: Good point, I didn't see that the counters were removed as well in the original
run. I'll get more numbers.

However, in general I think the process method is a bad place for hooks (or for anything
other than processing rows, really). Even if the profiler is the only thing that uses the
hooks (and there seems to be agreement that it's useful), we should still remove them for now
and rewrite the profiler in a way that has no effect on production systems.
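
One possible shape for such a rewrite, purely as a sketch: keep the per-row path free of hook work unless a hook has actually been configured. The names below (RowHook, SketchOperator, setHooks) are hypothetical stand-ins for illustration and are not Hive's actual classes or API. The sketch also skips allocating any per-row context object when no hooks are installed.

```java
import java.util.List;

// Hypothetical hook interface for illustration; not Hive's OperatorHook.
interface RowHook {
  void enter(Object row, int tag);
  void exit(Object row, int tag);
}

// Hypothetical operator base class; only the hook guard is of interest here.
abstract class SketchOperator {
  private static final RowHook[] NO_HOOKS = new RowHook[0];

  // Empty by default, so production runs never take the hook path.
  private RowHook[] hooks = NO_HOOKS;

  void setHooks(List<? extends RowHook> configured) {
    hooks = configured.toArray(new RowHook[0]);
  }

  final void process(Object row, int tag) {
    RowHook[] active = hooks;
    if (active.length == 0) {
      // Fast path: no hook calls and no per-row allocation.
      processOp(row, tag);
      return;
    }
    for (RowHook hook : active) {
      hook.enter(row, tag);
    }
    processOp(row, tag);
    for (RowHook hook : active) {
      hook.exit(row, tag);
    }
  }

  protected abstract void processOp(Object row, int tag);
}

// Minimal operator that only counts the rows it sees.
class CountingOperator extends SketchOperator {
  long rows;

  @Override
  protected void processOp(Object row, int tag) {
    rows++;
  }
}

// Minimal hook that records how often it was entered and exited.
class CountingHook implements RowHook {
  int enters;
  int exits;

  @Override
  public void enter(Object row, int tag) {
    enters++;
  }

  @Override
  public void exit(Object row, int tag) {
    exits++;
  }
}
```

Whether a length check like this is cheap enough would still need to be measured; moving the profiler into a separate wrapper that is only installed for profiling runs would be another option.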


                
> OperatorHooks hit performance even when not used
> ------------------------------------------------
>
>                 Key: HIVE-4318
>                 URL: https://issues.apache.org/jira/browse/HIVE-4318
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>         Environment: Ubuntu LXC (64 bit)
>            Reporter: Gopal V
>            Assignee: Gunther Hagleitner
>         Attachments: HIVE-4318.1.patch
>
>
> Operator hooks inserted into Operator.java cause a performance hit even when they are not being used.
> A count(1) query was tested with and without the operator hook calls:
> {code:title=with}
> 2013-04-09 07:33:58,920 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 84.07 sec
> Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
> OK
> 28800991
> Time taken: 40.407 seconds, Fetched: 1 row(s)
> {code}
> {code:title=without}
> 2013-04-09 07:33:02,355 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 68.48 sec
> ...
> Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
> OK
> 28800991
> Time taken: 35.907 seconds, Fetched: 1 row(s)
> {code}
> The effect is multiplied by the number of operators in the pipeline that have to forward the row: the more operators there are, the slower the query.
> The modification made to test this was:
> {code:title=Operator.java}
> --- ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
> +++ ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
> @@ -526,16 +526,16 @@ public void process(Object row, int tag) throws HiveException {
>        return;
>      }
>      OperatorHookContext opHookContext = new OperatorHookContext(this, row, tag);
> -    preProcessCounter();
> -    enterOperatorHooks(opHookContext);
> +    //preProcessCounter();
> +    //enterOperatorHooks(opHookContext);
>      processOp(row, tag);
> -    exitOperatorHooks(opHookContext);
> -    postProcessCounter();
> +    //exitOperatorHooks(opHookContext);
> +    //postProcessCounter();
>    }
> {code}
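
For scale, the figures quoted in the two runs above can be reduced to a relative overhead and a rough per-row cost. This is only arithmetic on the reported numbers, assuming the count(1) result (28800991) is the number of rows processed; the class and method names are made up for illustration. Since the "without" run also disabled the counter calls, the difference covers hooks and counters together.

```java
// Hypothetical helper; it only does arithmetic on the numbers reported in the issue.
class HookOverhead {
  // Relative increase of the "with" measurement over the "without" one, in percent.
  static double percentIncrease(double with, double without) {
    return (with - without) / without * 100.0;
  }

  // Extra time per row in nanoseconds, given the extra seconds and the row count.
  static double nanosPerRow(double extraSeconds, long rows) {
    return extraSeconds * 1e9 / rows;
  }

  public static void main(String[] args) {
    double cpuWith = 84.07;      // cumulative CPU seconds, hook calls enabled
    double cpuWithout = 68.48;   // cumulative CPU seconds, hook calls commented out
    double wallWith = 40.407;    // "Time taken" in seconds, hook calls enabled
    double wallWithout = 35.907; // "Time taken" in seconds, hook calls commented out
    long rows = 28800991L;       // count(1) result, assumed to equal rows processed

    System.out.printf("CPU increase: %.1f%%%n", percentIncrease(cpuWith, cpuWithout));
    System.out.printf("Wall-clock increase: %.1f%%%n", percentIncrease(wallWith, wallWithout));
    System.out.printf("Extra CPU per row: %.0f ns%n", nanosPerRow(cpuWith - cpuWithout, rows));
  }
}
```

That is roughly 23% more CPU time and 12.5% more wall-clock time for this single query, from one run of each variant.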

