hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Pamela Vagata (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-4318) OperatorHooks hit performance even when not used
Date Fri, 12 Apr 2013 19:12:16 GMT

    [ https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630511#comment-13630511
] 

Pamela Vagata commented on HIVE-4318:
-------------------------------------

Thanks for running these separately :) I just looked in OperatorHookUtils.java which is where
the opHooks list is being initialized - it looks like the opHooks list is always being initialized
even if there are no OperatorHooks installed. My suspicion is that if we returned null instead
of an empty list, the numbers would be different since a null check should be much cheaper.
Would you mind modifying OperatorHookUtils.getOperatorHooks to return null instead of an empty
list and then rerun the MBM with the code for the OperatorHooks left in and also commented
out to see what the difference is?
                
> OperatorHooks hit performance even when not used
> ------------------------------------------------
>
>                 Key: HIVE-4318
>                 URL: https://issues.apache.org/jira/browse/HIVE-4318
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>         Environment: Ubuntu LXC (64 bit)
>            Reporter: Gopal V
>            Assignee: Gunther Hagleitner
>         Attachments: HIVE-4318.1.patch
>
>
> Operator Hooks inserted into Operator.java cause a performance hit even when it is not
being used.
> For a count(1) query tested with & without the operator hook calls.
> {code:title=with}
> 2013-04-09 07:33:58,920 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 84.07 sec
> Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
> OK
> 28800991
> Time taken: 40.407 seconds, Fetched: 1 row(s)
> {code}
> {code:title=without}
> 2013-04-09 07:33:02,355 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 68.48 sec
> ...
> Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
> OK
> 28800991
> Time taken: 35.907 seconds, Fetched: 1 row(s)
> {code}
> The effect is multiplied by the number of operators in the pipeline that has to forward
the row - the more operators there are the, the slower the query.
> The modification made to test this was 
> {code:title=Operator.java}
> --- ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
> +++ ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
> @@ -526,16 +526,16 @@ public void process(Object row, int tag) throws HiveException {
>        return;
>      }
>      OperatorHookContext opHookContext = new OperatorHookContext(this, row, tag);
> -    preProcessCounter();
> -    enterOperatorHooks(opHookContext);
> +    //preProcessCounter();
> +    //enterOperatorHooks(opHookContext);
>      processOp(row, tag);
> -    exitOperatorHooks(opHookContext);
> -    postProcessCounter();
> +    //exitOperatorHooks(opHookContext);
> +    //postProcessCounter();
>    }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message