tajo-dev mailing list archives

From "Min Zhou (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TAJO-283) Add Table Partitioning
Date Mon, 23 Dec 2013 02:11:50 GMT

    [ https://issues.apache.org/jira/browse/TAJO-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855341#comment-13855341
] 

Min Zhou commented on TAJO-283:
-------------------------------

I spent a couple of hours going through the patches related to this feature and finally realized
that it's quite different from Hive's partitioning, but similar to Hive's bucketing, which is
generated by the DISTRIBUTED BY / CLUSTERED BY clause.

{noformat}
CREATE TABLE user_info_bucketed(user_id BIGINT, firstname STRING, lastname STRING) 
COMMENT 'A bucketed copy of user_info' 
CLUSTERED BY(user_id) INTO 256 BUCKETS;
{noformat}
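
As a rough illustration of what bucketing does with each row, the sketch below assigns rows to one of 256 buckets based on user_id. The hash used here (a plain non-negative modulo) is an assumption for illustration only and is not claimed to be Hive's actual hash function; bucket_for and the sample rows are hypothetical.

```python
NUM_BUCKETS = 256  # matches INTO 256 BUCKETS in the DDL above


def bucket_for(user_id: int, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a clustering-column value to a bucket index.

    Assumption: a non-negative modulo stands in for the engine's
    real hash function.
    """
    return (user_id & 0x7FFFFFFF) % num_buckets


# Hypothetical rows: (user_id, firstname, lastname)
rows = [(1, "Ann", "Lee"), (257, "Bob", "Kim"), (42, "Cy", "Park")]

# Group rows by bucket; each bucket would become one file of the table.
buckets = {}
for row in rows:
    buckets.setdefault(bucket_for(row[0]), []).append(row)
```

Note that the bucket of a row is a pure function of the column value and the bucket count, so no per-bucket metadata is required to locate it.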

Hive's partitioning is quite simple: normally each partition maps to an HDFS directory, and the
partition key is used like a column of the table, for example SELECT * FROM tbl WHERE
part_date = '20131222'. Hive's metadata holds one record for each partition of a table. Thus,
if a query selects just one partition, say '20131222', Hive finds the partitions involved in
the SQL through the metadata and skips the HDFS directories that are not needed for the query.
This process, called partition pruning, is executed on the planner side of Hive.
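
A minimal sketch of that pruning step, assuming a metastore that keeps one record per partition. The PARTITIONS dictionary, the directory paths, and prune_partitions are hypothetical names for illustration, not Hive's API.

```python
# Hypothetical metastore: one record per partition, mapping the
# partition-column value to the HDFS directory that holds its data.
PARTITIONS = {
    "20131220": "/warehouse/tbl/part_date=20131220",
    "20131221": "/warehouse/tbl/part_date=20131221",
    "20131222": "/warehouse/tbl/part_date=20131222",
}


def prune_partitions(partitions, predicate):
    """Return only the directories whose partition value satisfies the predicate."""
    return [path for value, path in sorted(partitions.items()) if predicate(value)]


# SELECT * FROM tbl WHERE part_date = '20131222'
dirs_to_scan = prune_partitions(PARTITIONS, lambda v: v == "20131222")
```

The planner only needs the metadata to decide which directories to read; the other directories are never opened.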

Tajo, by contrast, stores only one record in the catalog for all the partitions of a table,
recording the number of partitions rather than the details of each one. This may work for
hash/column partitions, but how about list/range partitions? Furthermore, if we want to
benefit from partition pruning as Hive does, how can we skip the I/O when there isn't any
metadata recording the I/O path of each partition?
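
To make the concern concrete, here is a small sketch contrasting the two cases. It is not Tajo code: hash_partition, range_partition, and the catalog values are hypothetical, and the modulo hash is an assumption.

```python
from bisect import bisect_right


def hash_partition(key: int, num_partitions: int) -> int:
    """Hash partitioning: the partition count alone identifies the target."""
    return key % num_partitions


def range_partition(key: int, upper_bounds: list) -> int:
    """Range partitioning: needs each partition's boundary.

    upper_bounds[i] is the exclusive upper bound of partition i; keys at
    or above the last bound fall into a final catch-all partition.
    """
    return bisect_right(upper_bounds, key)


# Hypothetical catalog contents.
num_partitions = 4              # enough for hash partitioning
upper_bounds = [100, 200, 300]  # per-partition detail needed for ranges
```

With only num_partitions in the catalog, hash_partition can still be evaluated, but range_partition cannot be answered without the stored boundaries.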

The reason I think Tajo's partitioning is like Hive's bucketing is that both are designed to
distribute rows according to the value of one column in each row. In the early days of Hive,
we used it like this: each table had a daily update, and people needed to create a brand new
table for each new report day, like tbl_20090101, tbl_20090102, ....  This was quite ugly and
messy.  So the Facebook guys created table partitioning, and later partition pruning, to avoid
scanning the whole table.



> Add Table Partitioning
> ----------------------
>
>                 Key: TAJO-283
>                 URL: https://issues.apache.org/jira/browse/TAJO-283
>             Project: Tajo
>          Issue Type: New Feature
>          Components: catalog, physical operator, planner/optimizer
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>             Fix For: 0.8-incubating
>
>
> Table partitioning provides many facilities for maintaining large tables. First of all, it enables
> the data management system to prune input data that is not actually necessary. In addition,
> it gives the system more optimization opportunities that exploit the physical layout.
> Basically, Tajo should follow the RDBMS-style partitioning system, including range, list,
> hash, and so on. In order to keep Hive compatibility, we need to add a Hive partition type that
> does not exist in existing DBMS systems.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
