carbondata-issues mailing list archives

From "Geetika Gupta (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CARBONDATA-658) Compression is not working for BigInt and Int datatype
Date Wed, 18 Jan 2017 10:05:26 GMT

     [ https://issues.apache.org/jira/browse/CARBONDATA-658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geetika Gupta updated CARBONDATA-658:
-------------------------------------
    Description: 
I tried to load data into a table with a BIGINT column. First I loaded small BIGINT values
into the table and noted the resulting CarbonData file size; then I loaded maximum BIGINT
values and noted the file size again.

For large BIGINT values the CarbonData file size was 684.25 KB and for small BIGINT values
it was 684.26 KB, so I could not tell whether compression was actually being applied.

I tried the same scenario with the INT datatype: for large INT values the CarbonData file
size was 684.24 KB and for small INT values it was 684.26 KB.
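The near-identical file sizes suggest the difference in value magnitude is not being exploited. As a rough illustration only (zlib here is a stand-in, not CarbonData's actual encoder), a general-purpose compressor shrinks a column of small 64-bit integers far more than a column of full-range ones, because small values leave most of their 8 bytes as zeros:

```python
import random
import struct
import zlib

random.seed(0)
N = 100_000

# Small BIGINT values: each fits in one byte, so 7 of every 8 bytes are zero.
small = struct.pack(f"<{N}q", *(random.randrange(0, 100) for _ in range(N)))
# Large BIGINT values: near the 64-bit range, so nearly every byte carries entropy.
large = struct.pack(f"<{N}q", *(random.getrandbits(63) for _ in range(N)))

# Both columns are 800,000 bytes raw; the small-value column compresses
# to a small fraction of that, the large-value column stays close to raw size.
print(len(zlib.compress(small)))
print(len(zlib.compress(large)))
```

If the CarbonData files for the two loads were encoded value-size-aware, a gap of this kind would be expected rather than a 0.01 KB difference.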

Below are the queries:
For BigInt table:

Create table test(a BigInt, b String) stored by 'carbondata';

LOAD DATA INPATH 'hdfs://localhost:54311/testFiles/100000_LargeBigInt.csv' into table test OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'FILEHEADER'='b,a');

LOAD DATA INPATH 'hdfs://localhost:54311/testFiles/100000_SmallBigInt.csv' into table test OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'FILEHEADER'='b,a');

For Int table:

Create table test(a Int, b String) stored by 'carbondata';

LOAD DATA INPATH 'hdfs://localhost:54311/testFiles/100000_LargeInt.csv' into table test OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'FILEHEADER'='b,a');

LOAD DATA INPATH 'hdfs://localhost:54311/testFiles/100000_SmallInt.csv' into table test OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'FILEHEADER'='b,a');
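The attached CSVs can be regenerated with a short script. This is a sketch under the assumption that each file holds 100,000 rows in the `b,a` column order, with no header row since the LOAD DATA statements supply `FILEHEADER='b,a'`; the string values and the exact numeric ranges are illustrative, not the reporter's actual data:

```python
import csv
import random

random.seed(42)
N = 100_000
INT64_MAX = 2**63 - 1  # maximum BIGINT value

def write_csv(path, values):
    """Write N rows in b,a order with no header (FILEHEADER supplies it)."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, quoting=csv.QUOTE_ALL)
        for i, v in enumerate(values):
            writer.writerow([f"row{i}", v])

# Small BIGINT values vs. values near the top of the BIGINT range.
write_csv("100000_SmallBigInt.csv",
          (random.randrange(0, 1000) for _ in range(N)))
write_csv("100000_LargeBigInt.csv",
          (INT64_MAX - random.randrange(0, 1000) for _ in range(N)))
```

The files then need to be copied to HDFS (e.g. with `hdfs dfs -put`) before running the LOAD DATA statements above.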


  was:I tried to load data into a table having bigInt as a column. Firstly I loaded small bigint values to the table and noted down the carbondata file size then I loaded max bigint values to the table and again noted the carbondata file size.

     Attachment: 100000_SmallInt.csv
                 100000_LargeInt.csv
                 100000_SmallBigInt.csv
                 100000_LargeBigInt.csv
    Environment: spark 1.6, 2.0  (was: spark 1.6)
        Summary: Compression is not working for BigInt and Int datatype  (was: Compression is not working for BigInt and Int)

> Compression is not working for BigInt and Int datatype
> ------------------------------------------------------
>
>                 Key: CARBONDATA-658
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-658
>             Project: CarbonData
>          Issue Type: Bug
>          Components: data-load
>    Affects Versions: 1.0.0-incubating
>         Environment: spark 1.6, 2.0
>            Reporter: Geetika Gupta
>         Attachments: 100000_LargeBigInt.csv, 100000_LargeInt.csv, 100000_SmallBigInt.csv, 100000_SmallInt.csv
>
>
> I tried to load data into a table with a BIGINT column. First I loaded small BIGINT values
> into the table and noted the resulting CarbonData file size; then I loaded maximum BIGINT
> values and noted the file size again.
> For large BIGINT values the CarbonData file size was 684.25 KB and for small BIGINT values
> it was 684.26 KB, so I could not tell whether compression was actually being applied.
> I tried the same scenario with the INT datatype: for large INT values the CarbonData file
> size was 684.24 KB and for small INT values it was 684.26 KB.
> Below are the queries:
> For BigInt table:
> Create table test(a BigInt, b String) stored by 'carbondata';
> LOAD DATA INPATH 'hdfs://localhost:54311/testFiles/100000_LargeBigInt.csv' into table test OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'FILEHEADER'='b,a');
> LOAD DATA INPATH 'hdfs://localhost:54311/testFiles/100000_SmallBigInt.csv' into table test OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'FILEHEADER'='b,a');
> For Int table:
> Create table test(a Int, b String) stored by 'carbondata';
> LOAD DATA INPATH 'hdfs://localhost:54311/testFiles/100000_LargeInt.csv' into table test OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'FILEHEADER'='b,a');
> LOAD DATA INPATH 'hdfs://localhost:54311/testFiles/100000_SmallInt.csv' into table test OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'FILEHEADER'='b,a');



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
