hadoop-hive-dev mailing list archives

From "Ning Zhang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1544) Filtering out NULL-keyed rows in ReduceSinkOperator when no outer join involved
Date Mon, 16 Aug 2010 21:22:16 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899096#action_12899096 ]

Ning Zhang commented on HIVE-1544:
----------------------------------

The JoinDesc already has a flag, noOuterJoin, that tracks whether any outer joins are involved
in the join operator. Based on that flag, we should set a flag in the ReduceSinkDesc to indicate
whether NULL-keyed rows will be filtered out.
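
For illustration only, the idea could be sketched roughly as below. This is not
Hive code: JoinDescSketch and ReduceSinkDescSketch are hypothetical stand-ins for
JoinDesc and ReduceSinkDesc, the flag name filterNullKeys is made up, and rows are
modeled as simple [key, value] arrays with a single join key. Only the noOuterJoin
flag is taken from the comment above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; none of these classes are Hive's real classes.
final class NullKeyFilterSketch {

    // Hypothetical stand-in for JoinDesc; only noOuterJoin comes from the comment.
    static final class JoinDescSketch {
        final boolean noOuterJoin;

        JoinDescSketch(boolean noOuterJoin) {
            this.noOuterJoin = noOuterJoin;
        }
    }

    // Hypothetical stand-in for ReduceSinkDesc; the flag name is an assumption.
    static final class ReduceSinkDescSketch {
        boolean filterNullKeys;
    }

    // Plan time: copy the join's "no outer join" fact onto the reduce sink.
    static ReduceSinkDescSketch planReduceSink(JoinDescSketch join) {
        ReduceSinkDescSketch desc = new ReduceSinkDescSketch();
        desc.filterNullKeys = join.noOuterJoin;
        return desc;
    }

    // Run time: each row is {key, value}. When the flag is set, rows whose key
    // is null are dropped instead of being sent on to the reducer.
    static List<Object[]> emit(ReduceSinkDescSketch desc, List<Object[]> rows) {
        List<Object[]> forwarded = new ArrayList<>();
        for (Object[] row : rows) {
            if (desc.filterNullKeys && row[0] == null) {
                continue; // a NULL key cannot match anything in a non-outer join
            }
            forwarded.add(row);
        }
        return forwarded;
    }

    public static void main(String[] args) {
        List<Object[]> rows = Arrays.asList(
                new Object[] {1, "a"},
                new Object[] {null, "b"},
                new Object[] {2, "c"});

        ReduceSinkDescSketch inner = planReduceSink(new JoinDescSketch(true));
        ReduceSinkDescSketch outer = planReduceSink(new JoinDescSketch(false));

        System.out.println("inner join forwards " + emit(inner, rows).size() + " rows");
        System.out.println("outer join forwards " + emit(outer, rows).size() + " rows");
    }
}
```

With an outer join the flag stays off and every row is forwarded, since NULL-keyed
rows from the preserved side must still appear in the output.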

> Filtering out NULL-keyed rows in ReduceSinkOperator when no outer join involved
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-1544
>                 URL: https://issues.apache.org/jira/browse/HIVE-1544
>             Project: Hadoop Hive
>          Issue Type: Improvement
>            Reporter: Ning Zhang
>
> As discussed in HIVE-741, if a plan indicates that a non-outer join is the first operator
> in the reducer, the ReduceSinkOperator should filter out (not send) rows with NULL keys,
> since they will not generate any results anyway. This should save both bandwidth and
> processing power.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

