tajo-issues mailing list archives

From "Hyoungjun Kim (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TAJO-897) PartitionedTableRewriter is repeated several times with same table.
Date Tue, 01 Jul 2014 14:41:24 GMT
Hyoungjun Kim created TAJO-897:
----------------------------------

             Summary: PartitionedTableRewriter is repeated several times with same table.

                 Key: TAJO-897
                 URL: https://issues.apache.org/jira/browse/TAJO-897
             Project: Tajo
          Issue Type: Bug
            Reporter: Hyoungjun Kim
            Assignee: Hyoungjun Kim
            Priority: Minor


See the title.
If a block contains a partitioned table, PartitionedTableRewriter runs on that table several
times. On the first run, after finding the partition paths, PartitionedTableRewriter removes the
partition filter condition. So on the next run all partitions are selected for scanning.
I ran the following query. The customer_parts table is partitioned by c_nationkey.
{code:sql}
select a.c_custkey, b.c_custkey from 
 (select c_custkey, c_nationkey from customer_parts where c_nationkey < 0 
 union all 
  select c_custkey, c_nationkey from customer_parts where c_nationkey < 0 
) a
left outer join customer_parts b
on a.c_custkey = b.c_custkey 
and a.c_nationkey > 0
{code}
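
The behavior described above can be illustrated with a small toy model. This is only a sketch and not Tajo's actual code: `ScanNode` and `rewrite` are hypothetical names, and the partition values are taken from the paths in the plan output. It assumes a rewriter pass that prunes partitions with the filter and then drops the filter, so a second pass over the same scan has nothing left to prune with.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ScanNode:
    """Hypothetical stand-in for a scan over a partitioned table."""
    all_partitions: List[int]                           # partition key values on disk
    partition_filter: Optional[Callable[[int], bool]]   # None once it has been removed
    selected: Optional[List[int]] = None                # partitions chosen for scanning


def rewrite(scan: ScanNode) -> None:
    """One pass of a toy rewriter: prune partitions, then drop the filter."""
    if scan.partition_filter is None:
        # No filter left to prune with, so every partition is kept.
        scan.selected = list(scan.all_partitions)
    else:
        scan.selected = [p for p in scan.all_partitions if scan.partition_filter(p)]
    # The filter is removed once the partition paths have been resolved.
    scan.partition_filter = None


# c_nationkey values seen in the plan output; the filter mirrors c_nationkey < 0.
scan = ScanNode(all_partitions=[1, 13, 15, 3, 4], partition_filter=lambda k: k < 0)

rewrite(scan)
first_pass = scan.selected    # the filter matches no partition

rewrite(scan)
second_pass = scan.selected   # the filter is gone, so all partitions are chosen

print(first_pass, second_pass)
```

In this model the first pass selects no partitions, while the second pass selects all five, which matches the "num of filtered paths: 5" reported for each scan in the plan.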


{noformat}
=======================================================
Block Id: eb_1404224996147_0002_000001 [LEAF]
=======================================================

[Outgoing]
[q_1404224996147_0002] 1 => 3 (type=HASH_SHUFFLE, key=default.a.c_custkey (INT4), num=32)

TABLE_SUBQUERY(19) as default.a
  => Targets: default.a.c_custkey (INT4) as default.a.c_custkey
  => out schema: {(1) default.a.c_custkey (INT4)}
  => in  schema: {(2) default.a.c_custkey (INT4),default.a.c_nationkey (INT4)}
   PARTITIONS_SCAN(16) on default.customer_parts
     => target list: default.customer_parts.c_custkey (INT4), default.customer_parts.c_nationkey (INT4)
     => num of filtered paths: 5
     => out schema: {(2) default.customer_parts.c_custkey (INT4),default.customer_parts.c_nationkey (INT4)}
     => in schema: {(7) default.customer_parts.c_custkey (INT4),default.customer_parts.c_name (TEXT),default.customer_parts.c_address (TEXT),default.customer_parts.c_phone (TEXT),default.customer_parts.c_acctbal (FLOAT8),default.customer_parts.c_mktsegment (TEXT),default.customer_parts.c_comment (TEXT)}
     => 0: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=1
     => 1: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=13
     => 2: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=15
     => 3: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=3
     => 4: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=4

=======================================================
Block Id: eb_1404224996147_0002_000002 [LEAF]
=======================================================

[Outgoing]
[q_1404224996147_0002] 2 => 3 (type=HASH_SHUFFLE, key=default.a.c_custkey (INT4), num=32)

TABLE_SUBQUERY(20) as default.a
  => Targets: default.a.c_custkey (INT4)
  => out schema: {(1) default.a.c_custkey (INT4)}
  => in  schema: {(2) default.a.c_custkey (INT4),default.a.c_nationkey (INT4)}
   PARTITIONS_SCAN(17) on default.customer_parts
     => target list: default.customer_parts.c_custkey (INT4), default.customer_parts.c_nationkey (INT4)
     => num of filtered paths: 5
     => out schema: {(2) default.customer_parts.c_custkey (INT4),default.customer_parts.c_nationkey (INT4)}
     => in schema: {(7) default.customer_parts.c_custkey (INT4),default.customer_parts.c_name (TEXT),default.customer_parts.c_address (TEXT),default.customer_parts.c_phone (TEXT),default.customer_parts.c_acctbal (FLOAT8),default.customer_parts.c_mktsegment (TEXT),default.customer_parts.c_comment (TEXT)}
     => 0: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=1
     => 1: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=13
     => 2: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=15
     => 3: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=3
     => 4: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=4

=======================================================
Block Id: eb_1404224996147_0002_000004 [LEAF]
=======================================================

[Outgoing]
[q_1404224996147_0002] 4 => 3 (type=HASH_SHUFFLE, key=default.b.c_custkey (INT4), num=32)

PARTITIONS_SCAN(15) on default.customer_parts
  => target list: default.b.c_custkey (INT4)
  => num of filtered paths: 5
  => out schema: {(1) default.b.c_custkey (INT4)}
  => in schema: {(7) default.b.c_custkey (INT4),default.b.c_name (TEXT),default.b.c_address (TEXT),default.b.c_phone (TEXT),default.b.c_acctbal (FLOAT8),default.b.c_mktsegment (TEXT),default.b.c_comment (TEXT)}
  => 0: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=1
  => 1: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=13
  => 2: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=15
  => 3: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=3
  => 4: hdfs://localhost:49896/tajo/warehouse/default/customer_parts/c_nationkey=4

=======================================================
Block Id: eb_1404224996147_0002_000003 [ROOT]
=======================================================
{noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
