hadoop-hive-dev mailing list archives

From "Zheng Shao (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-460) Improve ColumnPruner to prune more aggressively and keep column information for input tables
Date Thu, 30 Apr 2009 01:36:30 GMT

    [ https://issues.apache.org/jira/browse/HIVE-460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12704434#action_12704434 ]

Zheng Shao commented on HIVE-460:
---------------------------------

Some example queries for improvement:

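We should be able to prune "src.value as c2": the outer query reads only src1.c1 and src2.c4, yet the plan below still carries value through src1's Select Operator and Reduce Output Operator.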
{code}
join11.q:
SELECT src1.c1, src2.c4
FROM
(SELECT src.key as c1, src.value as c2 from src) src1
JOIN
(SELECT src.key as c3, src.value as c4 from src) src2
ON src1.c1 = src2.c3 AND src1.c1 < 100;

STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Alias -> Map Operator Tree:
        src2:src
            Select Operator
              expressions:
                    expr: key
                    type: string
                    expr: value
                    type: string
              Reduce Output Operator
                key expressions:
                      expr: 0
                      type: string
                sort order: +
                Map-reduce partition columns:
                      expr: 0
                      type: string
                tag: 1
                value expressions:
                      expr: 0
                      type: string
                      expr: 1
                      type: string
        src1:src
            Select Operator
              expressions:
                    expr: key
                    type: string
                    expr: value
                    type: string
              Filter Operator
                predicate:
                    expr: (UDFToDouble(0) < UDFToDouble(100))
                    type: boolean
                Reduce Output Operator
                  key expressions:
                        expr: 0
                        type: string
                  sort order: +
                  Map-reduce partition columns:
                        expr: 0
                        type: string
                  tag: 0
                  value expressions:
                        expr: 0
                        type: string
                        expr: 1
                        type: string
      Reduce Operator Tree:
        Join Operator
          condition map:
               Inner Join 0 to 1
          condition expressions:
            0 {VALUE.0} {VALUE.1}
            1 {VALUE.0} {VALUE.1}
          Select Operator
            expressions:
                  expr: 0
                  type: string
                  expr: 3
                  type: string
            File Output Operator
              compressed: false
              GlobalTableId: 0
              table:
                  input format: org.apache.hadoop.mapred.TextInputFormat
                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat

  Stage: Stage-0
    Fetch Operator
      limit: -1
{code}
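
The pruning wanted for join11.q can be sketched in a few lines of Python. This is only an illustration of the idea, not Hive's ColumnPruner: the dictionaries and the prune() helper are hypothetical names, and the model assumes every subquery column is a plain column reference.

```python
# Hypothetical model of join11.q: each subquery maps output alias -> source column.
subqueries = {
    "src1": {"c1": "key", "c2": "value"},
    "src2": {"c3": "key", "c4": "value"},
}

# Columns the outer query references: the select list plus the ON clause.
outer_refs = [
    ("src1", "c1"), ("src2", "c4"),                  # SELECT src1.c1, src2.c4
    ("src1", "c1"), ("src2", "c3"), ("src1", "c1"),  # ON src1.c1 = src2.c3 AND src1.c1 < 100
]

def prune(subqueries, outer_refs):
    """Return, per subquery, only the aliases that the outer query references."""
    needed = {name: set() for name in subqueries}
    for name, alias in outer_refs:
        needed[name].add(alias)
    return {
        name: {alias: col for alias, col in cols.items() if alias in needed[name]}
        for name, cols in subqueries.items()
    }

pruned = prune(subqueries, outer_refs)
print(pruned)
```

Under these assumptions src1 keeps only c1, while src2 keeps both c3 (join key) and c4 (selected), which matches the pruning asked for above.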

We should be able to prune "b" from the first job: the outer query selects only a, yet the map-side Select Operator still emits both a and b.
{code}
> explain SELECT a from (SELECT * from zshao_lazy LIMIT 10) t;

STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Alias -> Map Operator Tree:
        t:zshao_lazy
            Select Operator
              expressions:
                    expr: a
                    type: int
                    expr: b
                    type: string
              Limit
                Reduce Output Operator
                  sort order:
                  tag: -1
                  value expressions:
                        expr: 0
                        type: int
                        expr: 1
                        type: string
      Reduce Operator Tree:
        Extract
          Limit
            Select Operator
              expressions:
                    expr: 0
                    type: int
              File Output Operator
                compressed: true
                GlobalTableId: 0
                table:
                    input format: org.apache.hadoop.mapred.TextInputFormat
                    output format: org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat

  Stage: Stage-0
    Fetch Operator
      limit: -1
{code}
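
A rough Python sketch of how the requirement could be pushed back through the operator chain for this query. It is not Hive's implementation: the plan list and prune_chain() are hypothetical, and the model assumes each Select only forwards plain columns while Limit, Reduce Output and Extract pass their input through unchanged.

```python
# Hypothetical operator chain for the query above, listed from map side to final output.
# Each entry is (operator name, columns it emits); None means "passes its input through".
plan = [
    ("Select", ["a", "b"]),   # SELECT * FROM zshao_lazy
    ("Limit", None),
    ("ReduceOutput", None),
    ("Extract", None),
    ("Limit", None),
    ("Select", ["a"]),        # SELECT a FROM (...) t
]

def prune_chain(plan):
    """Walk from the last operator back to the first, keeping only columns needed downstream."""
    needed = None  # None: nothing downstream has constrained the column set yet
    pruned = []
    for name, cols in reversed(plan):
        if cols is not None:
            if needed is not None:
                cols = [c for c in cols if c in needed]
            needed = set(cols)
        pruned.append((name, cols))
    pruned.reverse()
    return pruned

pruned_plan = prune_chain(plan)
print(pruned_plan[0])
```

In this model the first Select ends up emitting only a, so b never reaches the Reduce Output Operator. Dropping a column does not change which rows LIMIT returns, which is why the requirement can cross the Limit operators.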



> Improve ColumnPruner to prune more aggressively and keep column information for input tables
> --------------------------------------------------------------------------------------------
>
>                 Key: HIVE-460
>                 URL: https://issues.apache.org/jira/browse/HIVE-460
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Zheng Shao
>
> This is required for column-based table format.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

