carbondata-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ravindra Pesala (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CARBONDATA-1424) Delete Operation working incorrectly when subquery returns bad-record
Date Wed, 06 Sep 2017 04:29:00 GMT

    [ https://issues.apache.org/jira/browse/CARBONDATA-1424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16154790#comment-16154790
] 

Ravindra Pesala commented on CARBONDATA-1424:
---------------------------------------------

[~sangeeta04] It is not an issue as nulls cannot be filtered in that way. Please take the
below scenario and compare with hive.

{code}
select * from uniqdata_delete where cust_id in (select cust_id from uniqdata_delete limit
10)
{code}
The above query only returns non null data only even for hive as well. so delete also behaves
in the same way. Please verify the above query with hive and close it if not an issue

> Delete Operation working incorrectly when subquery returns bad-record
> ---------------------------------------------------------------------
>
>                 Key: CARBONDATA-1424
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1424
>             Project: CarbonData
>          Issue Type: Bug
>          Components: sql, test
>    Affects Versions: 1.2.0
>            Reporter: Sangeeta Gulia
>            Assignee: Ravindra Pesala
>            Priority: Minor
>         Attachments: 3000_UniqData.csv
>
>
> Delete Operation is working incorrectly when subquery returns bad-record for a particular
table. 
> For the given query, 
> delete from uniqdata_delete where cust_id in (select cust_id from uniqdata_delete limit
10);
> As an Example, if "select cust_id from uniqdata_delete limit 10" returns  
> +----------+--+
> | cust_id  |
> +----------+--+
> | NULL     |
> | NULL     |
> | NULL     |
> | NULL     |
> | 11000    |
> | 11001     |
> | 11002     |
> | 11003     |
> | 11004     |
> | 11005    |
> +----------+--+
> then the query should delete all rows where cust_id is Null or matches any values from
the returned values(11000-11005) whereas it deletes only those records where customer id is
from (11000-11005).
> I have attached the sample csv file which i have used for reference.
> To Regenerate the issue, you can use below commands : 
> CREATE TABLE uniqdata_delete (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string,
DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1
decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,
INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ("TABLE_BLOCKSIZE"=
"256 MB");
> LOAD DATA INPATH 'hdfs://localhost:54310/user/hduser/input-files/3000_UniqData.csv' into
table uniqdata_delete OPTIONS('FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> NOTE : Load should be such that starting rows of data should have null stored for cust_id
field. 
> delete from uniqdata_delete where cust_id in (select cust_id from uniqdata_delete limit
10);



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message