hive-issues mailing list archives

From "Gopal V (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HIVE-16100) Dynamic Sorted Partition optimizer loses sibling operators
Date Mon, 06 Mar 2017 10:52:32 GMT

    [ https://issues.apache.org/jira/browse/HIVE-16100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15897084#comment-15897084 ]

Gopal V edited comment on HIVE-16100 at 3/6/17 10:51 AM:
---------------------------------------------------------

The scenario is more along these lines:

{code}
TS -> FIL -> SEL -> FS
              |
              + -> FS
{code}

{code}
Stage-4
  Stats-Aggr Operator
    Stage-0
      Move Operator
        table:{"name:":"testing.over1k_part4_0"}
        Stage-3
          Dependency Collection{}
            Stage-2
              Map 1 vectorized, llap
              File Output Operator [FS_10]
                table:{"name:":"testing.over1k_part4_0"}
                Select Operator [SEL_9] (rows=1 width=0)
                  Output:["_col0","_col1"]
                  Filter Operator [FIL_8] (rows=1 width=0)
                    predicate:(s like 'bob%')
                    TableScan [TS_0] (rows=1 width=0)
                      testing@over1k,over1k,Tbl:PARTIAL,Col:NONE,Output:["i","s"]
              File Output Operator [FS_11]
                table:{"name:":"testing.over1k_part4_1"}
                 Please refer to the previous Select Operator [SEL_9]
{code}

SEL_9 -> FS_11
SEL_9 -> FS_10

so SEL_9 ends up with 2 FS children.

However, I see that the test case passes even without the patch, because the backtracking
link is not cleared. It looks like the FS_11 -> SEL_9 parent relationship isn't modified by
the optimizer.
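
To make the asymmetry concrete, here is a minimal sketch of the two link directions. The `Op` class is a hypothetical stand-in for Hive's operator class, modelling only a parent list and a child list, and the `unlinkOnlySelf` variant is an assumption for comparison, not necessarily what the attached patch does. It only illustrates that clearing the parent's child list drops the sibling FS from the forward direction while the sibling's own parent pointer stays intact.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model: NOT Hive's Operator class, just the two link lists.
final class OperatorLinkSketch {

  static final class Op {
    final String name;
    final List<Op> parents = new ArrayList<>();
    final List<Op> children = new ArrayList<>();

    Op(String name) { this.name = name; }

    // Wire both directions: parent -> child and child -> parent.
    static void connect(Op parent, Op child) {
      parent.children.add(child);
      child.parents.add(parent);
    }
  }

  // Builds SEL_9 with two FS children; returns {SEL_9, FS_10, FS_11}.
  static Op[] buildPlan() {
    Op sel = new Op("SEL_9");
    Op fs10 = new Op("FS_10");
    Op fs11 = new Op("FS_11");
    Op.connect(sel, fs10);
    Op.connect(sel, fs11);
    return new Op[] {sel, fs10, fs11};
  }

  // Mirrors the snippet from the issue: drop every child of the FS's parent.
  static void unlinkWithClear(Op fsOp) {
    Op fsParent = fsOp.parents.get(0);
    fsParent.children.clear();
  }

  // Assumed alternative for comparison: drop only this FS, keep its siblings.
  static void unlinkOnlySelf(Op fsOp) {
    Op fsParent = fsOp.parents.get(0);
    fsParent.children.remove(fsOp);
  }

  public static void main(String[] args) {
    Op[] plan = buildPlan();
    unlinkWithClear(plan[1]); // optimizer handles FS_10
    System.out.println("SEL_9 children after clear(): " + plan[0].children.size());
    System.out.println("FS_11 parent after clear(): " + plan[2].parents.get(0).name);
  }
}
```

In this model, anything that walks upward from FS_11 through its parent list still reaches SEL_9 after the `clear()`, which is consistent with a test passing when it only depends on that direction.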


> Dynamic Sorted Partition optimizer loses sibling operators
> ----------------------------------------------------------
>
>                 Key: HIVE-16100
>                 URL: https://issues.apache.org/jira/browse/HIVE-16100
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>    Affects Versions: 1.2.1, 2.2.0, 2.1.1
>            Reporter: Gopal V
>            Assignee: Gopal V
>         Attachments: HIVE-16100.1.patch, HIVE-16100.2.patch, HIVE-16100.2.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java#L173
> {code}
>       // unlink connection between FS and its parent
>       fsParent = fsOp.getParentOperators().get(0);
>       fsParent.getChildOperators().clear();
> {code}
> The optimizer discards any cases where the fsParent has another SEL child 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
