carbondata-issues mailing list archives

From "Neha Bhardwaj (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CARBONDATA-1427) After Splitting Partition, Data doesn't get Divided to Different Partitions.
Date Tue, 29 Aug 2017 10:38:00 GMT
Neha Bhardwaj created CARBONDATA-1427:
-----------------------------------------

             Summary: After Splitting Partition, Data doesn't get Divided to Different Partitions.
                 Key: CARBONDATA-1427
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1427
             Project: CarbonData
          Issue Type: Bug
          Components: data-query
         Environment: spark 2.1
            Reporter: Neha Bhardwaj
            Priority: Minor


When a Split Partition query is run on a partitioned table, the existing data is not affected
at all. The show partitions query lists the new partitions and no longer lists the old
partition, but the data still remains in the old partition.

Ideally, the data should be divided according to the new partitions. At present this happens
only after subsequent loads, when the data ends up in the latest partitions.

Example :
1. Create Table :
DROP TABLE IF EXISTS list_partition_table;

CREATE TABLE list_partition_table(shortField SHORT, intField INT, bigintField LONG,
doubleField DOUBLE, timestampField TIMESTAMP, decimalField DECIMAL(18,2), dateField DATE,
charField CHAR(5), floatField FLOAT, complexData ARRAY<STRING>)
PARTITIONED BY (stringField STRING)
STORED BY 'carbondata'
TBLPROPERTIES('PARTITION_TYPE'='LIST', 'LIST_INFO'='Asia, (China, Europe, NoPartition)');

2. Load Data :
 load data inpath 'hdfs://localhost:54310/CSV/list_partition_table.csv' into table list_partition_table
options('FILEHEADER'='shortfield,intfield,bigintfield,doublefield,stringfield,timestampfield,decimalfield,datefield,charfield,floatfield,complexdata',
'COMPLEX_DELIMITER_LEVEL_1'='$','COMPLEX_DELIMITER_LEVEL_2'='#');

3. Show Partitions :
show partitions list_partition_table;
+----------------------------------------------+--+
|                  partition                   |
+----------------------------------------------+--+
| 0, stringfield = DEFAULT                     |
| 1, stringfield = Asia                        |
| 2, stringfield = China, Europe, NoPartition  |
+----------------------------------------------+--+
3 rows selected (0.09 seconds)

4. Split Partition :
ALTER TABLE list_partition_table SPLIT PARTITION(2) INTO('China', '(Europe, NoPartition)');

5. Show Partitions :
show partitions list_partition_table;
+---------------------------------------+--+
|               partition               |
+---------------------------------------+--+
| 0, stringfield = DEFAULT              |
| 1, stringfield = Asia                 |
| 3, stringfield = China                |
| 4, stringfield = Europe, NoPartition  |
+---------------------------------------+--+
4 rows selected (0.065 seconds)

The partitions get updated, but the data remains unchanged (not repartitioned) and stays in
the same partition.
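
As an illustration of the expected behavior, here is a minimal Python sketch of a list-partition split. It is a toy model, not CarbonData's implementation: the helper names `assign` and `split_partition`, the sample `rows` values, and the rule that new partition IDs continue after the highest existing ID are assumptions made for illustration (the last one merely matches the IDs 3 and 4 shown in step 5).

```python
# Toy model of the expected behavior; this is NOT CarbonData's implementation.
from collections import defaultdict


def assign(rows, partitions, default_id=0):
    """Group row values by the ID of the list partition that contains them."""
    value_to_id = {v: pid for pid, values in partitions.items() for v in values}
    placed = defaultdict(list)
    for value in rows:
        # Values not named in any list partition fall into the DEFAULT partition.
        placed[value_to_id.get(value, default_id)].append(value)
    return dict(placed)


def split_partition(partitions, old_id, new_groups):
    """Replace partition `old_id` with new partitions numbered after the highest ID."""
    # Assumption: the new groups must cover exactly the values of the old partition.
    if sorted(v for g in new_groups for v in g) != sorted(partitions[old_id]):
        raise ValueError("new groups must cover exactly the values of the old partition")
    result = {pid: vals for pid, vals in partitions.items() if pid != old_id}
    next_id = max(partitions) + 1
    for offset, group in enumerate(new_groups):
        result[next_id + offset] = list(group)
    return result


# Partition layout from step 3 (ID 0 is the DEFAULT partition).
before = {1: ["Asia"], 2: ["China", "Europe", "NoPartition"]}
after = split_partition(before, 2, [["China"], ["Europe", "NoPartition"]])

# Hypothetical stringField values; the real CSV contents are not given in the report.
rows = ["Asia", "China", "Europe", "China", "Other"]

print(after)                 # {1: ['Asia'], 3: ['China'], 4: ['Europe', 'NoPartition']}
print(assign(rows, before))  # {1: ['Asia'], 2: ['China', 'Europe', 'China'], 0: ['Other']}
print(assign(rows, after))   # {1: ['Asia'], 3: ['China', 'China'], 4: ['Europe'], 0: ['Other']}
```

In this model, the rows that sat in partition 2 before the split end up in partitions 3 and 4 afterwards, which is the redistribution the report expects from SPLIT PARTITION itself rather than only after later loads.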



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
