hive-dev mailing list archives

From "Kevin Wilfong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-4318) OperatorHooks hit performance even when not used
Date Fri, 12 Apr 2013 21:18:16 GMT

    [ https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630637#comment-13630637 ]

Kevin Wilfong commented on HIVE-4318:
-------------------------------------

It's not clear to me that we can't cut the cost of operator hooks, when no hooks are
present, down to the point where it does not significantly affect performance.

Pam, could you provide Gunther a patch which sets the list of operator hooks to null rather
than the empty list, and initializes the OperatorHookContext inside enterOperatorHooks
and exitOperatorHooks, after the check for whether the list is null?  This should limit the
impact of operator hooks to two method calls and two null checks.  We could even put the
this.operatorHooks == null check around the method calls themselves, in case the Java
compiler isn't inlining them for some reason.
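
To make the proposal concrete, here is a minimal sketch of the null-list approach. It is
not the actual patch: SketchOperator, the OperatorHook interface, the addHook method and
the rowsProcessed counter are hypothetical stand-ins invented for illustration, and the
real Operator.java and hook API may look different.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not Hive's actual classes.
interface OperatorHook {
  void enter(OperatorHookContext ctx);
  void exit(OperatorHookContext ctx);
}

class OperatorHookContext {
  final Object operator;
  final Object row;
  final int tag;

  OperatorHookContext(Object operator, Object row, int tag) {
    this.operator = operator;
    this.row = row;
    this.tag = tag;
  }
}

class SketchOperator {
  // null, rather than an empty list, means "no hooks registered".
  private List<OperatorHook> operatorHooks = null;
  long rowsProcessed = 0;

  void addHook(OperatorHook hook) {
    if (operatorHooks == null) {
      operatorHooks = new ArrayList<>();
    }
    operatorHooks.add(hook);
  }

  void process(Object row, int tag) {
    // With no hooks, each hook call below costs one method call and one null check.
    enterOperatorHooks(row, tag);
    processOp(row, tag);
    exitOperatorHooks(row, tag);
  }

  private void enterOperatorHooks(Object row, int tag) {
    if (operatorHooks == null) {
      return; // fast path: no context allocated, no iteration
    }
    // The context is only created once we know at least one hook exists.
    OperatorHookContext ctx = new OperatorHookContext(this, row, tag);
    for (OperatorHook hook : operatorHooks) {
      hook.enter(ctx);
    }
  }

  private void exitOperatorHooks(Object row, int tag) {
    if (operatorHooks == null) {
      return; // fast path, as above
    }
    OperatorHookContext ctx = new OperatorHookContext(this, row, tag);
    for (OperatorHook hook : operatorHooks) {
      hook.exit(ctx);
    }
  }

  // Placeholder for the operator's real per-row work.
  private void processOp(Object row, int tag) {
    rowsProcessed++;
  }
}
```

The variant with the check around the calls would move the operatorHooks == null test
into process() itself, so that the two method calls are skipped as well.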

If, after that, they still introduce a substantial amount of overhead, there's not much more
we can do, and I'd be ok with removing operator hooks.
                
> OperatorHooks hit performance even when not used
> ------------------------------------------------
>
>                 Key: HIVE-4318
>                 URL: https://issues.apache.org/jira/browse/HIVE-4318
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>         Environment: Ubuntu LXC (64 bit)
>            Reporter: Gopal V
>            Assignee: Gunther Hagleitner
>         Attachments: HIVE-4318.1.patch, HIVE-4318.2.patch
>
>
> Operator Hooks inserted into Operator.java cause a performance hit even when they are not being used.
> Results for a count(1) query tested with and without the operator hook calls:
> {code:title=with}
> 2013-04-09 07:33:58,920 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 84.07 sec
> Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
> OK
> 28800991
> Time taken: 40.407 seconds, Fetched: 1 row(s)
> {code}
> {code:title=without}
> 2013-04-09 07:33:02,355 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 68.48 sec
> ...
> Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
> OK
> 28800991
> Time taken: 35.907 seconds, Fetched: 1 row(s)
> {code}
> The effect is multiplied by the number of operators in the pipeline that have to forward the row: the more operators there are, the slower the query.
> The modification made to test this was:
> {code:title=Operator.java}
> --- ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
> +++ ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
> @@ -526,16 +526,16 @@ public void process(Object row, int tag) throws HiveException {
>        return;
>      }
>      OperatorHookContext opHookContext = new OperatorHookContext(this, row, tag);
> -    preProcessCounter();
> -    enterOperatorHooks(opHookContext);
> +    //preProcessCounter();
> +    //enterOperatorHooks(opHookContext);
>      processOp(row, tag);
> -    exitOperatorHooks(opHookContext);
> -    postProcessCounter();
> +    //exitOperatorHooks(opHookContext);
> +    //postProcessCounter();
>    }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
