impala-issues mailing list archives

From "Matyas Orhidi (JIRA)" <j...@apache.org>
Subject [jira] [Created] (IMPALA-5589) set; shows COMPRESSION_CODEC: [NONE] by default
Date Tue, 27 Jun 2017 21:34:00 GMT
Matyas Orhidi created IMPALA-5589:
-------------------------------------

             Summary: set; shows COMPRESSION_CODEC: [NONE] by default
                 Key: IMPALA-5589
                 URL: https://issues.apache.org/jira/browse/IMPALA-5589
             Project: IMPALA
          Issue Type: Bug
            Reporter: Matyas Orhidi


The default value of COMPRESSION_CODEC reported by impala-shell is incorrect: set; shows NONE
instead of SNAPPY:

[morhidi-511-2:21000] > set;
...
COMPRESSION_CODEC: [NONE]
...
However, a CTAS run with this default creates a Snappy-compressed Parquet file:

[morhidi-511-2:21000] > create table customers_default stored as parquet as select * from customers;
Query: create table customers_default stored as parquet as select * from customers
Query submitted at: 2017-06-27 14:13:02 (Coordinator: http://morhidi-511-2.gce.cloudera.com:25000)
Query progress can be monitored at: http://morhidi-511-2.gce.cloudera.com:25000/query_plan?query_id=b04ca94585082aec:8eef8ea500000000
+--------------------+
| summary            |
+--------------------+
| Inserted 53 row(s) |
+--------------------+
Fetched 1 row(s) in 6.83s

[root@morhidi-511-1 ~]# hdfs dfs -ls /user/hive/warehouse/customers_default
Found 2 items
drwxrwx--x+  - hive hive          0 2017-06-27 14:13 /user/hive/warehouse/customers_default/_impala_insert_staging
-rwxrwx--x+  3 hive hive       1560 2017-06-27 14:13 /user/hive/warehouse/customers_default/b04ca94585082aec-8eef8ea500000000_333480558_data.0.parq
[root@morhidi-511-1 ~]# hdfs dfs -get /user/hive/warehouse/customers_default/b04ca94585082aec-8eef8ea500000000_333480558_data.0.parq


[root@morhidi-511-1 ~]# parquet-tools meta b04ca94585082aec-8eef8ea500000000_333480558_data.0.parq
creator:     impala version 2.8.0-cdh5.11.0 (build e09660de6b503a15f07e84b99b63e8e745854c34)


file schema: schema 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
id:          OPTIONAL INT32 R:0 D:1
name:        OPTIONAL BINARY R:0 D:1

row group 1: RC:53 TS:1281 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
id:           INT32 SNAPPY DO:4 FPO:235 SZ:300/294/0.98 VC:53 ENC:RLE,PLAIN,PLAIN_DICTIONARY
name:         BINARY SNAPPY DO:337 FPO:1249 SZ:981/1063/1.08 VC:53 ENC:RLE,PLAIN,PLAIN_DICTIONARY
[root@morhidi-511-1 ~]# 
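
The codec check above can be automated. The following is a minimal sketch, not part of the original report: it assumes the `parquet-tools meta` output keeps the column-line layout shown above (`name: TYPE CODEC DO:...`), and the names `COLUMN_LINE` and `column_codecs` are hypothetical helpers introduced for illustration.

```python
import re

# Assumption: column lines look like the ones in this report, e.g.
# "id:           INT32 SNAPPY DO:4 FPO:235 SZ:300/294/0.98 VC:53 ENC:..."
# Schema lines ("id: OPTIONAL INT32 R:0 D:1") have no "DO:" field and are skipped.
COLUMN_LINE = re.compile(
    r"^(?P<name>\S+):\s+(?P<type>[A-Z0-9_]+)\s+(?P<codec>[A-Z0-9_]+)\s+DO:\d+"
)


def column_codecs(meta_output):
    """Return {column name: codec} parsed from `parquet-tools meta` text."""
    codecs = {}
    for line in meta_output.splitlines():
        match = COLUMN_LINE.match(line.strip())
        if match:
            codecs[match.group("name")] = match.group("codec")
    return codecs


# Lines copied from the output above.
SAMPLE = """\
creator:     impala version 2.8.0-cdh5.11.0 (build e09660de6b503a15f07e84b99b63e8e745854c34)
file schema: schema
id:          OPTIONAL INT32 R:0 D:1
name:        OPTIONAL BINARY R:0 D:1
row group 1: RC:53 TS:1281
id:           INT32 SNAPPY DO:4 FPO:235 SZ:300/294/0.98 VC:53 ENC:RLE,PLAIN,PLAIN_DICTIONARY
name:         BINARY SNAPPY DO:337 FPO:1249 SZ:981/1063/1.08 VC:53 ENC:RLE,PLAIN,PLAIN_DICTIONARY
"""

print(column_codecs(SAMPLE))
```

For the file written with the default setting, both columns come back as SNAPPY, which contradicts the NONE shown by set;.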

If you set it explicitly to NONE within the session, the set; output still looks the same:


set COMPRESSION_CODEC=NONE;

...
COMPRESSION_CODEC: [NONE]
...

However, this time a CTAS creates an uncompressed Parquet file:

[morhidi-511-2:21000] > create table customers_none stored as parquet as select * from customers;
Query: create table customers_none stored as parquet as select * from customers
Query submitted at: 2017-06-27 14:21:36 (Coordinator: http://morhidi-511-2.gce.cloudera.com:25000)
Query progress can be monitored at: http://morhidi-511-2.gce.cloudera.com:25000/query_plan?query_id=25483de926163646:fba7ab500000000
+--------------------+
| summary            |
+--------------------+
| Inserted 53 row(s) |
+--------------------+

[root@morhidi-511-1 ~]# hdfs dfs -ls /user/hive/warehouse/customers_none
Found 2 items
-rwxrwx--x+  3 hive hive       1636 2017-06-27 14:21 /user/hive/warehouse/customers_none/25483de926163646-fba7ab500000000_1792268023_data.0.parq
drwxrwx--x+  - hive hive          0 2017-06-27 14:21 /user/hive/warehouse/customers_none/_impala_insert_staging
[root@morhidi-511-1 ~]# hdfs dfs -get /user/hive/warehouse/customers_none/25483de926163646-fba7ab500000000_1792268023_data.0.parq
[root@morhidi-511-1 ~]# parquet-tools meta 25483de926163646-fba7ab500000000_1792268023_data.0.parq
creator:     impala version 2.8.0-cdh5.11.0 (build e09660de6b503a15f07e84b99b63e8e745854c34)


file schema: schema 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
id:          OPTIONAL INT32 R:0 D:1
name:        OPTIONAL BINARY R:0 D:1

row group 1: RC:53 TS:1357 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
id:           INT32 UNCOMPRESSED DO:4 FPO:231 SZ:294/294/1.00 VC:53 ENC:PLAIN_DICTIONARY,PLAIN,RLE
name:         BINARY UNCOMPRESSED DO:331 FPO:1327 SZ:1063/1063/1.00 VC:53 ENC:PLAIN_DICTIONARY,PLAIN,RLE
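
The mismatch between what set; reports and what ends up in the file could be flagged with a small check. This is only a sketch under stated assumptions: it assumes set; prints options as `NAME: [VALUE]` as in the excerpts above, and that parquet-tools prints UNCOMPRESSED for files written with NONE (as the second run suggests). `reported_option`, `matches_reported`, and `PARQUET_TOOLS_NAME` are hypothetical names, not Impala or parquet-tools APIs.

```python
import re

# Assumption: option lines in the `set;` output look like "COMPRESSION_CODEC: [NONE]".
OPTION_LINE = re.compile(
    r"^\s*(?P<name>[A-Z0-9_]+):\s*\[?(?P<value>[^\]\s]*)\]?\s*$"
)

# Assumption drawn from this report: a file written with NONE is listed
# as UNCOMPRESSED by parquet-tools; other codec names are compared as-is.
PARQUET_TOOLS_NAME = {"NONE": "UNCOMPRESSED"}


def reported_option(set_output, name):
    """Return the value shown for `name` in `set;` output, or None if absent."""
    for line in set_output.splitlines():
        match = OPTION_LINE.match(line)
        if match and match.group("name") == name:
            return match.group("value")
    return None


def matches_reported(reported, observed_codecs):
    """True if every observed column codec agrees with the reported option."""
    expected = PARQUET_TOOLS_NAME.get(reported.upper(), reported.upper())
    return set(observed_codecs) == {expected}


SET_OUTPUT = "...\nCOMPRESSION_CODEC: [NONE]\n..."
reported = reported_option(SET_OUTPUT, "COMPRESSION_CODEC")

# Default session: set; says NONE, but the file's columns are SNAPPY.
print(matches_reported(reported, ["SNAPPY", "SNAPPY"]))
# After an explicit set COMPRESSION_CODEC=NONE the two agree.
print(matches_reported(reported, ["UNCOMPRESSED", "UNCOMPRESSED"]))
```

Only the first case, the untouched default, is inconsistent: the displayed value is the same in both sessions, but the codec actually applied differs.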
