tajo-dev mailing list archives

From "Hyunsik Choi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TAJO-283) Add Table Partitioning
Date Mon, 16 Dec 2013 04:57:06 GMT

    [ https://issues.apache.org/jira/browse/TAJO-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13848822#comment-13848822 ]

Hyunsik Choi commented on TAJO-283:
-----------------------------------

Hi Min,

For now, we assume that one HDFS directory corresponds to one partition. We will support
Hive-style partitioning first. Later, we will support range/interval, list, and hash
partitioning, as well as composite partitioning that combines them.
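
To illustrate the "one directory is one partition" assumption, here is a minimal sketch of
how a Hive-style layout maps partition column values to a directory path. It is written in
Python for brevity and is not Tajo code: the function name, the table root, and the column
names are hypothetical, and the escaping of special characters in values is left out.

```python
import posixpath


def partition_dir(table_root, partition_values):
    """Return the directory for one partition in a Hive-style layout.

    partition_values is an ordered list of (column, value) pairs; each pair
    becomes one "column=value" path segment under the table root.
    Hypothetical helper for illustration only (no value escaping).
    """
    segments = [f"{column}={value}" for column, value in partition_values]
    return posixpath.join(table_root, *segments)


# Example with made-up table root and columns.
print(partition_dir("/warehouse/sales", [("country", "KR"), ("year", 2013)]))
```

Under this layout, every distinct combination of partition column values gets its own
directory, so the partition key can be read back from the path alone.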

For Hive-style partitioning, ColumnPartitionedTableStoreExec was implemented in TAJO-329. Tajo
already uses hash and range shuffles for distributed group-by, join, and sort, so I expect that
we can easily implement the other partition types.
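
As a rough sketch of why the other partition types look similar to the existing shuffles,
the functions below assign a key to a partition index by hash, by range, and by list. This
is an illustration in Python under my own assumptions, not Tajo's implementation: all
function names, the use of CRC32 as the hash, and the convention that range bounds are
exclusive upper bounds are hypothetical choices.

```python
import bisect
import zlib


def hash_partition(key, num_partitions):
    """Map a key to one of num_partitions buckets.

    CRC32 is used because it is stable across runs, unlike Python's
    built-in hash() for strings.
    """
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def range_partition(key, upper_bounds):
    """Map a key to a range partition.

    upper_bounds is a sorted list of exclusive upper bounds. Keys at or
    above the last bound fall into one extra, final partition.
    """
    return bisect.bisect_right(upper_bounds, key)


def list_partition(key, value_lists, default=None):
    """Map a key to the first partition whose value list contains it."""
    for index, values in enumerate(value_lists):
        if key in values:
            return index
    return default


print(range_partition(15, [10, 20]))               # partition for keys in [10, 20)
print(list_partition("US", [{"KR", "JP"}, {"US"}]))
print(hash_partition("user-42", 4))
```

A composite scheme can be sketched by applying one of these functions per level, for
example a range partition on a date column and then a hash partition within each range.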

> Add Table Partitioning
> ----------------------
>
>                 Key: TAJO-283
>                 URL: https://issues.apache.org/jira/browse/TAJO-283
>             Project: Tajo
>          Issue Type: New Feature
>          Components: catalog, physical operator, planner/optimizer
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>             Fix For: 0.8-incubating
>
>
> Table partitioning provides many facilities for maintaining large tables. First of all, it enables
the data management system to prune input data that is not actually needed. In addition,
it gives the system more optimization opportunities that exploit the physical layout.
> Basically, Tajo should follow RDBMS-style partitioning, including range, list,
hash, and so on. To keep Hive compatibility, we also need to add a Hive partition type, which
does not exist in existing DBMS systems.
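
The pruning benefit mentioned in the issue description can be sketched as follows: when
partition values are encoded in directory names, a filter on a partition column can be
evaluated against the paths alone, and non-matching directories are never read. This is a
Python illustration only; the function names and example paths are hypothetical and do not
reflect Tajo's planner.

```python
def parse_partition_path(path):
    """Extract {column: value} from "column=value" segments of a path.

    Values are returned as strings, exactly as they appear in the path.
    """
    values = {}
    for segment in path.split("/"):
        if "=" in segment:
            column, _, value = segment.partition("=")
            values[column] = value
    return values


def prune_partitions(paths, predicate):
    """Keep only the partition directories whose values satisfy predicate."""
    return [p for p in paths if predicate(parse_partition_path(p))]


# Made-up partition directories for a table partitioned by country and year.
paths = [
    "/warehouse/sales/country=KR/year=2012",
    "/warehouse/sales/country=KR/year=2013",
    "/warehouse/sales/country=US/year=2013",
]

# Equivalent of "WHERE year = 2013": only two of three directories are scanned.
print(prune_partitions(paths, lambda v: v["year"] == "2013"))
```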



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
