carbondata-issues mailing list archives

From "Akash R Nilugal (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CARBONDATA-2021) when delete is success and update is failed while writing status file then a stale carbon data file is created.
Date Fri, 19 Jan 2018 13:39:00 GMT

     [ https://issues.apache.org/jira/browse/CARBONDATA-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal updated CARBONDATA-2021:
----------------------------------------
    Description: 
When a delete succeeds but the update fails while writing the status file, a stale carbon data file is left behind. Such a file should be removed during cleanup and must not be considered during queries.

Similarly, when a running update operation is stopped abruptly by the user, the carbon data file it wrote remains in the store, so extra rows show up in query results.

During the next update, cleanup of these files needs to be handled, and queries must also exclude the stale data files.
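The cleanup described above can be sketched roughly as follows. This is an illustrative sketch only: the class, method names, and file names are hypothetical and do not reflect CarbonData's actual internals. The assumption is that the table status file records which carbondata files belong to valid segments, so any file on disk that is not recorded there was left by a failed or aborted update and can be deleted (and skipped at query time).

```java
import java.util.*;
import java.util.stream.*;

// Hypothetical sketch; names and file-name formats are illustrative,
// not CarbonData's real API.
public class StaleFileCleanup {

    /**
     * Given all carbondata files found in a segment directory and the set of
     * file names recorded as valid in the table status file, return the files
     * left behind by a failed/aborted update, which are safe to delete.
     */
    static List<String> findStaleFiles(List<String> filesOnDisk,
                                       Set<String> filesInStatus) {
        return filesOnDisk.stream()
                .filter(f -> !filesInStatus.contains(f)) // not tracked => stale
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> onDisk = Arrays.asList(
                "part-0-0_batchno0-0-1.carbondata",   // valid, in status file
                "part-0-1_batchno0-0-2.carbondata");  // written by failed update
        Set<String> inStatus = new HashSet<>(
                Collections.singletonList("part-0-0_batchno0-0-1.carbondata"));

        // The stale file should be removed on cleanup and excluded from queries.
        System.out.println(findStaleFiles(onDisk, inStatus));
        // -> [part-0-1_batchno0-0-2.carbondata]
    }
}
```

The same "on disk but not in status" check could serve both purposes the description mentions: deleting the file during the next update's cleanup, and filtering it out when building the list of files to scan for a query.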

 

Steps to reproduce:

CREATE TABLE uniqdata_string(
  CUST_ID int, CUST_NAME String, DOB timestamp, DOJ timestamp,
  BIGINT_COLUMN1 bigint, BIGINT_COLUMN2 bigint,
  DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),
  Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 int)
PARTITIONED BY (ACTIVE_EMUI_VERSION string)
STORED BY 'org.apache.carbondata.format'
TBLPROPERTIES ('TABLE_BLOCKSIZE'='256 MB');

LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv'
INTO TABLE uniqdata_string PARTITION (active_emui_version='abc')
OPTIONS ('FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1',
         'BAD_RECORDS_ACTION'='FORCE');

CREATE TABLE uniqdata_hive (
  CUST_ID int, CUST_NAME String, ACTIVE_EMUI_VERSION string,
  DOB timestamp, DOJ timestamp,
  BIGINT_COLUMN1 bigint, BIGINT_COLUMN2 bigint,
  DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),
  Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 int)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

INSERT OVERWRITE TABLE uniqdata_string PARTITION (active_emui_version='xxx')
SELECT CUST_ID, CUST_NAME, DOB, DOJ,
       BIGINT_COLUMN1, BIGINT_COLUMN2, DECIMAL_COLUMN1, DECIMAL_COLUMN2,
       Double_COLUMN1, Double_COLUMN2, INTEGER_COLUMN1
FROM uniqdata_hive LIMIT 10;

Sample input row:
9000,CUST_NAME_00000,ACTIVE_EMUI_VERSION_00000,1970-01-01 01:00:03,1970-01-01 02:00:03,123372036854,-223372036854,12345678901.1234000000,22345678901.1234000000,11234567489.7976000000,-11234567489.7976000000,1

> when delete is success and update is failed while writing status file then a stale carbon data file is created.
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-2021
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2021
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: Akash R Nilugal
>            Assignee: Akash R Nilugal
>            Priority: Minor
>          Time Spent: 4h 50m
>  Remaining Estimate: 0h
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
