hive-dev mailing list archives

From "Xuefu Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-7384) Research into reduce-side join
Date Thu, 10 Jul 2014 23:15:06 GMT

     [ https://issues.apache.org/jira/browse/HIVE-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated HIVE-7384:
------------------------------

    Description: 
Hive's join operator is very sophisticated, especially for reduce-side join. While we expect
that other types of join, such as map-side join and SMB map-side join, will work out of the
box with our design, there may be some complications in reduce-side join, which makes extensive
use of key tags and shuffle behavior. Our design principle is to make the Hive implementation
work out of the box as well, which might require new functionality from Spark. The task is to
research this area, identifying requirements for the Spark community and the work to be done on
Hive to make reduce-side join work.

A design doc might be needed for this. For more information, please refer to the overall design
doc on the wiki.

  was:
Hive's join operator is very sophisticated, especially for reduce-side join. While we expect
that other types of join, such as map-side join and SMB map-side join, will work out of the
box with our design, there may be some complication in reduce-side join, which extensively
utilizes key tag and shuffle behavior. Our design principle prefer to make Hive implementation
work out of box also, which might requires new functionality from Spark. The tasks is to research
into this area, identifying requirements for Spark community and work to be done on Hive to
make reduce-side join work.

A design doc might be needed for this. For more information, please refer to the overall design
doc on wiki.


> Research into reduce-side join
> ------------------------------
>
>                 Key: HIVE-7384
>                 URL: https://issues.apache.org/jira/browse/HIVE-7384
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Xuefu Zhang
>
> Hive's join operator is very sophisticated, especially for reduce-side join. While we
> expect that other types of join, such as map-side join and SMB map-side join, will work out
> of the box with our design, there may be some complications in reduce-side join, which makes
> extensive use of key tags and shuffle behavior. Our design principle is to make the Hive
> implementation work out of the box as well, which might require new functionality from Spark.
> The task is to research this area, identifying requirements for the Spark community and the
> work to be done on Hive to make reduce-side join work.
> A design doc might be needed for this. For more information, please refer to the overall
> design doc on the wiki.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
