hive-dev mailing list archives

From "Suhas Satish (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-8700) Replace ReduceSink to HashTableSink (or equi.) for small tables [Spark Branch]
Date Wed, 05 Nov 2014 20:24:35 GMT

     [ https://issues.apache.org/jira/browse/HIVE-8700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suhas Satish updated HIVE-8700:
-------------------------------
    Status: Patch Available  (was: Open)

> Replace ReduceSink to HashTableSink (or equi.) for small tables [Spark Branch]
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-8700
>                 URL: https://issues.apache.org/jira/browse/HIVE-8700
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: Suhas Satish
>         Attachments: HIVE-8700-spark.patch, HIVE-8700.patch
>
>
> With HIVE-8616 enabled, the new plan has a ReduceSinkOperator for the small tables. For
> example, the following represents the operator plan for the small table dec1, derived from the query
> {code}explain select /*+ MAPJOIN(dec)*/ * from dec join dec1 on dec.value=dec1.d;{code}
> {code}
>         Map 2 
>             Map Operator Tree:
>                 TableScan
>                   alias: dec1
>                   Statistics: Num rows: 0 Data size: 107 Basic stats: PARTIAL Column stats: NONE
>                   Filter Operator
>                     predicate: d is not null (type: boolean)
>                     Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
>                     Reduce Output Operator
>                       key expressions: d (type: decimal(5,2))
>                       sort order: +
>                       Map-reduce partition columns: d (type: decimal(5,2))
>                       Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
>                       value expressions: i (type: int)
> {code}
> With the new design for broadcasting small tables, we need to replace the ReduceSinkOperator
> with HashTableSinkOperator or an equivalent in the new plan.
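
For illustration only, the plan change described above can be pictured as a rewrite over the operator tree of the small table. The sketch below is a toy model in Python, not Hive code: the `Op` class and the `replace_reduce_sinks` and `chain` helpers are hypothetical names invented for this example, and the real change would work on Hive's own operator classes and carry over their key and value descriptors.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Op:
    """Toy operator node (hypothetical; not Hive's Operator class)."""
    kind: str
    children: List["Op"] = field(default_factory=list)


def replace_reduce_sinks(op: Op) -> Op:
    """Return a copy of the tree with every ReduceSink node renamed to HashTableSink."""
    kind = "HashTableSink" if op.kind == "ReduceSink" else op.kind
    return Op(kind, [replace_reduce_sinks(child) for child in op.children])


def chain(op: Op) -> List[str]:
    """List operator kinds along the first-child path, root first."""
    kinds = []
    while True:
        kinds.append(op.kind)
        if not op.children:
            return kinds
        op = op.children[0]


# Shape of the "Map 2" plan for dec1 quoted above: TableScan -> Filter -> ReduceSink.
small_table_plan = Op("TableScan", [Op("Filter", [Op("ReduceSink")])])
rewritten = replace_reduce_sinks(small_table_plan)
print(" -> ".join(chain(rewritten)))  # TableScan -> Filter -> HashTableSink
```

The rewrite builds a new tree rather than mutating the input, so the original small-table plan stays available for comparison.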



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
