hadoop-hive-dev mailing list archives

From "Prasad Chakka (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-718) Load data inpath into a new partition without overwrite does not move the file
Date Sat, 12 Sep 2009 02:00:57 GMT

    [ https://issues.apache.org/jira/browse/HIVE-718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754461#action_12754461 ]

Prasad Chakka commented on HIVE-718:
------------------------------------

The only change I made was to not create the partition/table until Hive.copyFiles() returns,
i.e. in 0.3 the partition/table directory was created (if it did not exist) before copyFiles()
was called. That could be the reason for the discrepancy between 0.3 and 0.4, but I am not
sure.

We can't create the directory up front if we want to support correct semantics (i.e. the partition
directory does not exist until the data has been copied completely). This is needed for both
COPY and REPLACE; without it, downstream consumers get corrupted/incomplete data.

But if you want to keep the 0.3 semantics (which we might want to, since COPY is otherwise quite
unusable), we just need to create the destf directory in Hive.copyFiles().
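
For illustration only, here is a minimal sketch of the two behaviors described above. It
uses java.nio.file on a local file system rather than Hadoop's FileSystem API, and it is not
the code in Hive.copyFiles(). The class and method names (LoadSemanticsSketch,
copyThenPublish, copyInto) are hypothetical, and file-name collisions are deliberately not
handled.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; names are illustrative and not part of Hive.
public class LoadSemanticsSketch {

    // "Correct" semantics: dest does not exist until every file has been copied.
    // Only works for a new partition, because dest must not exist yet.
    static void copyThenPublish(List<Path> srcFiles, Path dest) throws IOException {
        if (Files.exists(dest)) {
            throw new FileAlreadyExistsException(dest.toString());
        }
        // Stage next to dest so the final rename stays on one file system.
        Path staging = Files.createTempDirectory(dest.toAbsolutePath().getParent(), "_staging");
        for (Path src : srcFiles) {
            Files.copy(src, staging.resolve(src.getFileName()));
        }
        // Publish the fully populated directory in one step.
        Files.move(staging, dest, StandardCopyOption.ATOMIC_MOVE);
    }

    // 0.3-style semantics: create dest if it is missing, then copy into it.
    // Readers may observe dest while it is still empty or partially filled.
    static void copyInto(List<Path> srcFiles, Path dest) throws IOException {
        Files.createDirectories(dest);
        for (Path src : srcFiles) {
            // Throws FileAlreadyExistsException if a file of that name is already there.
            Files.copy(src, dest.resolve(src.getFileName()));
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("hive718");
        Path src = Files.write(root.resolve("insert.txt"), Arrays.asList("a", "b", "c"));
        List<Path> files = Collections.singletonList(src);

        Path strict = root.resolve("ds=strict");
        copyThenPublish(files, strict);
        System.out.println("strict partition exists: " + Files.exists(strict));

        Path lenient = root.resolve("ds=lenient");
        copyInto(files, lenient);
        System.out.println("lenient partition exists: " + Files.exists(lenient));
    }
}
```

The trade-off is the one described above: the first variant never exposes a half-filled
directory but cannot add files to a partition that already exists, while the second can
append to an existing partition at the cost of exposing incomplete data during the copy.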


> Load data inpath into a new partition without overwrite does not move the file
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-718
>                 URL: https://issues.apache.org/jira/browse/HIVE-718
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.4.0
>            Reporter: Zheng Shao
>         Attachments: HIVE-718.1.patch, HIVE-718.2.patch, hive-718.txt
>
>
> The bug can be reproduced as follows. Note that it only happens for partitioned tables.
> The select after the first load returns nothing, while the second returns the data correctly.
> insert.txt in the current local directory contains 3 lines: "a", "b" and "c".
> {code}
> > create table tmp_insert_test (value string) stored as textfile;
> > load data local inpath 'insert.txt' into table tmp_insert_test;
> > select * from tmp_insert_test;
> a
> b
> c
> > create table tmp_insert_test_p (value string) partitioned by (ds string) stored as textfile;
> > load data local inpath 'insert.txt' into table tmp_insert_test_p partition (ds = '2009-08-01');
> > select * from tmp_insert_test_p where ds = '2009-08-01';
> > load data local inpath 'insert.txt' into table tmp_insert_test_p partition (ds = '2009-08-01');
> > select * from tmp_insert_test_p where ds = '2009-08-01';
> a       2009-08-01
> b       2009-08-01
> c       2009-08-01
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

