hive-dev mailing list archives

From "Suhas Satish (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-8616) convert joinOp to MapJoinOp and generate MapWorks only
Date Mon, 27 Oct 2014 21:46:33 GMT

     [ https://issues.apache.org/jira/browse/HIVE-8616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suhas Satish updated HIVE-8616:
-------------------------------
    Status: Patch Available  (was: Open)

Attached a patch that addresses this sub-task. With the patch applied, the explain plan
for a 3-way join looks like this:

explain select * from table1 join table2 on (table1.key = table2.key) join table3 on table1.key = table3.key;

OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Spark
      Edges:
        Map 1 <- Map 2 (NONE, 0), Map 3 (NONE, 0)
      DagName: ssatish_20141027131919_0ab004f6-5495-44b4-b7b1-16bf8ca15473:2
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: table1
                  Statistics: Num rows: 55 Data size: 5812 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: key is not null (type: boolean)
                    Statistics: Num rows: 28 Data size: 2958 Basic stats: COMPLETE Column stats: NONE
                    Map Join Operator
                      condition map:
                           Inner Join 0 to 1
                           Inner Join 0 to 2
                      condition expressions:
                        0 {key} {value}
                        1 {key} {value}
                        2 {key} {value}
                      keys:
                        0 key (type: int)
                        1 key (type: int)
                        2 key (type: int)
                      outputColumnNames: _col0, _col1, _col5, _col6, _col10, _col11
                      input vertices:
                        1 Map 3
                        2 Map 2
                      Statistics: Num rows: 61 Data size: 6507 Basic stats: COMPLETE Column stats: NONE
                      Select Operator
                        expressions: _col0 (type: int), _col1 (type: string), _col5 (type: int), _col6 (type: string), _col10 (type: int), _col11 (type: string)
                        outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
                        Statistics: Num rows: 61 Data size: 6507 Basic stats: COMPLETE Column stats: NONE
                        File Output Operator
                          compressed: false
                          Statistics: Num rows: 61 Data size: 6507 Basic stats: COMPLETE Column stats: NONE
                          table:
                              input format: org.apache.hadoop.mapred.TextInputFormat
                              output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                              serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
        Map 2
            Map Operator Tree:
                TableScan
                  alias: table3
                  Statistics: Num rows: 1 Data size: 140 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: key is not null (type: boolean)
                    Statistics: Num rows: 1 Data size: 140 Basic stats: COMPLETE Column stats: NONE
                    Reduce Output Operator
                      key expressions: key (type: int)
                      sort order: +
                      Map-reduce partition columns: key (type: int)
                      Statistics: Num rows: 1 Data size: 140 Basic stats: COMPLETE Column stats: NONE
                      value expressions: value (type: string)
        Map 3
            Map Operator Tree:
                TableScan
                  alias: table2
                  Statistics: Num rows: 55 Data size: 5791 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: key is not null (type: boolean)
                    Statistics: Num rows: 28 Data size: 2948 Basic stats: COMPLETE Column stats: NONE
                    Reduce Output Operator
                      key expressions: key (type: int)
                      sort order: +
                      Map-reduce partition columns: key (type: int)
                      Statistics: Num rows: 28 Data size: 2948 Basic stats: COMPLETE Column stats: NONE
                      value expressions: value (type: string)

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
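
For readers unfamiliar with map joins, the shape of the plan above (Map 1 streams
table1 and probes in-memory inputs from Map 3 and Map 2) can be pictured with a rough
sketch. This is only an illustration of the general map-side hash join idea, not Hive's
or the patch's implementation: the function names build_hash_table and map_join and the
sample rows are hypothetical, and rows are assumed to be (key, value) pairs.

```python
from collections import defaultdict
from itertools import product


def build_hash_table(rows):
    """Index a small table's (key, value) rows by join key, skipping NULL keys."""
    table = defaultdict(list)
    for key, value in rows:
        if key is not None:  # mirrors the "key is not null" Filter Operator
            table[key].append((key, value))
    return table


def map_join(big_rows, *small_tables):
    """Stream big_rows and probe each in-memory hash table (inner join on key)."""
    for key, value in big_rows:
        if key is None:
            continue
        # One list of matching rows per small table; an empty list means the
        # inner join produces nothing for this key.
        matches = [table.get(key, []) for table in small_tables]
        for combo in product(*matches):
            yield (key, value) + tuple(col for row in combo for col in row)


# Hypothetical sample data, not taken from the tables in the plan.
table1 = [(1, "a"), (2, "b"), (None, "x")]
table2 = [(1, "c"), (1, "d"), (3, "e")]
table3 = [(1, "f"), (2, "g")]

result = list(map_join(table1, build_hash_table(table2), build_hash_table(table3)))
for row in result:
    print(row)
```

Each output row carries key and value from table1, table2 and table3 in that order,
which matches the int/string column pattern in the Select Operator above.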



> convert joinOp to MapJoinOp and generate MapWorks only
> ------------------------------------------------------
>
>                 Key: HIVE-8616
>                 URL: https://issues.apache.org/jira/browse/HIVE-8616
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Suhas Satish
>            Assignee: Suhas Satish
>         Attachments: HIVE-8616-spark.patch
>
>
> This is a sub-task of map join on Spark.
> The parent JIRA is
> https://issues.apache.org/jira/browse/HIVE-7613



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
