tajo-issues mailing list archives

From "Hyunsik Choi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TAJO-714) Enable setting Parquet tuning parameters
Date Wed, 02 Apr 2014 02:25:14 GMT

    [ https://issues.apache.org/jira/browse/TAJO-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13957264#comment-13957264 ]

Hyunsik Choi commented on TAJO-714:
-----------------------------------

+1 for your latest patch.

I reviewed the patch. The patch is straightforward and works well.

Below are my test results.

{noformat:title='With SNAPPY compression'}
tpch> create table orders_parquet8 using parquet with ('parquet.compression' = 'SNAPPY') as select * from orders;
Progress: 0%, response time: 0.392 sec
Progress: 0%, response time: 1.195 sec
Progress: 66%, response time: 2.198 sec
Progress: 100%, response time: 2.798 sec
final state: QUERY_SUCCEEDED, response time: 2.798 sec
OK

tpch> \d orders_parquet6

table name: tpch.orders_parquet6
table path: hdfs://127.0.0.1:8020/tajo/warehouse/tpch/orders_parquet6
store type: PARQUET
number of rows: 1500000
volume: 56.1 MB
Options: 
	'parquet.enable.dictionary'='true'
	'parquet.compression'='SNAPPY'
	'parquet.validation'='false'
	'parquet.page.size'='1048576'
	'parquet.block.size'='134217728'

schema: 
o_orderkey	INT8
o_custkey	INT8
o_orderstatus	TEXT
o_totalprice	FLOAT8
o_orderdate	TEXT
o_orderpriority	TEXT
o_clerk	TEXT
o_shippriority	INT4
o_comment	TEXT
{noformat}


{noformat:title='default setting'}
tpch> create table orders_parquet7 using parquet as select * from orders;
Progress: 0%, response time: 0.394 sec
Progress: 0%, response time: 1.196 sec
Progress: 66%, response time: 2.199 sec
Progress: 100%, response time: 2.683 sec
final state: QUERY_SUCCEEDED, response time: 2.683 sec
OK
tpch> \d orders_parquet7

table name: tpch.orders_parquet7
table path: hdfs://127.0.0.1:8020/tajo/warehouse/tpch/orders_parquet7
store type: PARQUET
number of rows: 1500000
volume: 115.3 MB
Options: 
	'parquet.enable.dictionary'='true'
	'parquet.compression'='uncompressed'
	'parquet.validation'='false'
	'parquet.page.size'='1048576'
	'parquet.block.size'='134217728'

schema: 
o_orderkey	INT8
o_custkey	INT8
o_orderstatus	TEXT
o_totalprice	FLOAT8
o_orderdate	TEXT
o_orderpriority	TEXT
o_clerk	TEXT
o_shippriority	INT4
o_comment	TEXT
{noformat}
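
To put the two volumes side by side, the reported sizes can be compared with a few lines of Python. This is only a rough sketch and not part of the patch: the function name `compression_summary` is made up for illustration, and the figures are the `volume` values printed by `\d` above (56.1 MB for the table listing 'parquet.compression'='SNAPPY', 115.3 MB for the table listing 'uncompressed').

```python
# Volumes as printed by \d in the two sessions above, in MB.
snappy_mb = 56.1
uncompressed_mb = 115.3


def compression_summary(compressed, uncompressed):
    """Return (size ratio, percent of space saved) for two table volumes.

    Hypothetical helper for illustration; not part of Tajo or the patch.
    """
    ratio = uncompressed / compressed
    saved_pct = (1 - compressed / uncompressed) * 100
    return ratio, saved_pct


ratio, saved_pct = compression_summary(snappy_mb, uncompressed_mb)
print(f"ratio: {ratio:.2f}x, space saved: {saved_pct:.1f}%")
```

With these numbers the SNAPPY table comes out at roughly half the size of the uncompressed one, while the reported response times of the two queries are nearly the same.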


> Enable setting Parquet tuning parameters
> ----------------------------------------
>
>                 Key: TAJO-714
>                 URL: https://issues.apache.org/jira/browse/TAJO-714
>             Project: Tajo
>          Issue Type: Improvement
>            Reporter: David Chen
>            Assignee: David Chen
>         Attachments: TAJO-714.patch, TAJO-714_2.patch, TAJO-714_20140331_19:21:16.patch, TAJO-714_20140331_21:05:42.patch
>
>
> The first version of Parquet support does not support setting Parquet's tuning configuration parameters, such as compression, row group and page size, dictionary encoding, etc.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
