hadoop-hive-dev mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-718) Load data inpath into a new partition without overwrite does not move the file
Date Thu, 20 Aug 2009 18:59:14 GMT

    [ https://issues.apache.org/jira/browse/HIVE-718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745572#action_12745572 ]

Todd Lipcon commented on HIVE-718:
----------------------------------

Thanks for the explanation. I'll try to block out an hour this afternoon or evening to look
through the code again and see if I can fix this issue one way or another.

There is one ugly solution: use a file (not a directory) as a lock. That is to say, open a
file "foo_load_lock" for creation. If it's already being created by another writer, the NN
will throw an IOException to the second writer, so we can enforce exclusive access. The
problem here is what to do in the failure scenario, which might leave a lock hanging around
in perpetuity :(
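
A minimal sketch of the lock-file idea, for illustration only. It uses the local filesystem
rather than HDFS: exclusive creation via O_CREAT | O_EXCL stands in for the NN rejecting a
second creator. The names load_lock, LockHeldError and demo are hypothetical and not part of
Hive or Hadoop.

```python
import os
import tempfile
from contextlib import contextmanager


class LockHeldError(Exception):
    """Raised when another writer already holds the lock file (hypothetical name)."""


@contextmanager
def load_lock(lock_path):
    # O_CREAT | O_EXCL makes creation atomic: exactly one caller can create the file.
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError as exc:
        raise LockHeldError(lock_path) from exc
    try:
        yield
    finally:
        # Released on normal exit and on exceptions, but NOT if the process is killed.
        os.close(fd)
        os.remove(lock_path)


def demo():
    lock_path = os.path.join(tempfile.mkdtemp(), "foo_load_lock")
    events = []
    with load_lock(lock_path):
        events.append("first writer acquired")
        try:
            # A second writer tries to take the same lock while it is held.
            with load_lock(lock_path):
                events.append("second writer acquired")
        except LockHeldError:
            events.append("second writer rejected")
    events.append("lock released" if not os.path.exists(lock_path) else "lock leaked")
    return events
```

The sketch does not solve the failure scenario: if the holder dies before the cleanup runs,
the lock file stays behind and has to be removed or expired some other way.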

> Load data inpath into a new partition without overwrite does not move the file
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-718
>                 URL: https://issues.apache.org/jira/browse/HIVE-718
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.4.0
>            Reporter: Zheng Shao
>         Attachments: HIVE-718.1.patch, HIVE-718.2.patch, hive-718.txt
>
>
> The bug can be reproduced as follows. Note that it only happens for partitioned tables. The select after the first load returns nothing, while the second returns the data correctly.
> insert.txt in the current local directory contains 3 lines: "a", "b" and "c".
> {code}
> > create table tmp_insert_test (value string) stored as textfile;
> > load data local inpath 'insert.txt' into table tmp_insert_test;
> > select * from tmp_insert_test;
> a
> b
> c
> > create table tmp_insert_test_p (value string) partitioned by (ds string) stored as textfile;
> > load data local inpath 'insert.txt' into table tmp_insert_test_p partition (ds = '2009-08-01');
> > select * from tmp_insert_test_p where ds= '2009-08-01';
> > load data local inpath 'insert.txt' into table tmp_insert_test_p partition (ds = '2009-08-01');
> > select * from tmp_insert_test_p where ds= '2009-08-01';
> a       2009-08-01
> b       2009-08-01
> c       2009-08-01
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

