hive-dev mailing list archives

From "Szehon Ho" <>
Subject Re: Review Request 16938: HIVE-6209 'LOAD DATA INPATH ... OVERWRITE ..' doesn't overwrite current data
Date Wed, 22 Jan 2014 03:36:06 GMT

(Updated Jan. 22, 2014, 3:36 a.m.)

Review request for hive.


Added a test per the review comments.  Before the patch this test failed, with the second
count(*) returning 1000; it now returns the correct value, 500.

Bugs: HIVE-6209

Repository: hive-git


HIVE-3756 introduced an incorrect condition that prevented 'LOAD DATA ... OVERWRITE' from
working properly.  In these situations, destf == oldPath == /user/warehouse/hive/<tableName>,
so the -rmr of the old data was skipped.
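
To make the skipped cleanup concrete, here is a minimal Python sketch of the kind of guard
described above.  It is not the actual Hive code: the function names are hypothetical, and the
exact form of the HIVE-3756 condition is an assumption based only on the statement that the
removal was skipped when destf == oldPath.

```python
def should_remove_buggy(destf: str, old_path: str, old_exists: bool) -> bool:
    # Assumed shape of the faulty guard: removal is skipped when the
    # destination and the old path are the same directory.
    return old_exists and destf != old_path


def should_remove_fixed(destf: str, old_path: str, old_exists: bool) -> bool:
    # Overwrite semantics in this sketch: existing old data is removed
    # whether or not the destination equals the old path.
    return old_exists


# Path taken from the description above; <tableName> is left as a placeholder.
table_dir = "/user/warehouse/hive/<tableName>"

print(should_remove_buggy(table_dir, table_dir, True))  # False: old data is kept
print(should_remove_fixed(table_dir, table_dir, True))  # True: old data is removed
```

With the assumed faulty guard, the common case of loading straight into the table directory
never triggers the removal, which matches the symptom reported in HIVE-6209.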

Note that if the file name is the same, i.e. load data inpath '<path>' is run repeatedly with
the same path, the load works because the rename overwrites the old data file.  In this case,
however, the file name is different.
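
The difference between the two cases can be illustrated with a toy model, sketched here in
Python under stated assumptions: the table directory is modelled as a dict mapping file name
to row count, the file names a.txt and b.txt are hypothetical, and each load is assumed to
carry 500 rows (the figure from the test mentioned above).

```python
def load_overwrite(table_dir: dict, file_name: str, rows: int, remove_old: bool) -> int:
    """Simulate one overwrite load and return the resulting count(*)."""
    if remove_old:
        # Models the removal of old data before the new file is moved in.
        table_dir.clear()
    # Models the rename: a file with the same name is replaced,
    # a file with a different name is simply added next to the old one.
    table_dir[file_name] = rows
    return sum(table_dir.values())


# Same file name twice, old data never removed: the rename hides the bug.
same_name = {}
load_overwrite(same_name, "a.txt", 500, remove_old=False)
print(load_overwrite(same_name, "a.txt", 500, remove_old=False))  # 500

# Different file names, old data never removed: rows accumulate.
diff_name = {}
load_overwrite(diff_name, "a.txt", 500, remove_old=False)
print(load_overwrite(diff_name, "b.txt", 500, remove_old=False))  # 1000

# Different file names, old data removed first: the expected result.
fixed = {}
load_overwrite(fixed, "a.txt", 500, remove_old=True)
print(load_overwrite(fixed, "b.txt", 500, remove_old=True))  # 500
```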

The other minor changes improve logging in this area so that issues (for example, file
permission problems) are easier to diagnose.

Diffs (updated)

  ql/src/java/org/apache/hadoop/hive/ql/metadata/ 2fe86e1 
  ql/src/test/queries/clientpositive/load_fs_overwrite.q PRE-CREATION 
  ql/src/test/results/clientpositive/load_fs_overwrite.q.out PRE-CREATION 



The primary concern was whether removing the directory in these scenarios would make the rename
fail.  It should not, because fs.mkdirs is called beforehand, but I still verified the following
scenarios:

load/insert overwrite into table with partitions
load/insert overwrite into table with buckets
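
The remove, mkdirs, rename ordering discussed above can be sketched on a local filesystem.
This is only an analogy, written in Python with the standard library: the helper names and
file names are hypothetical, and local os.rename semantics are assumed to stand in for the
Hadoop FileSystem calls, which may behave differently.

```python
import os
import shutil
import tempfile


def overwrite_dir_with_file(src_file: str, dest_dir: str) -> list:
    """Remove dest_dir, recreate it, then move src_file into it."""
    if os.path.isdir(dest_dir):
        shutil.rmtree(dest_dir)           # stands in for the -rmr of old data
    os.makedirs(dest_dir, exist_ok=True)  # stands in for fs.mkdirs
    target = os.path.join(dest_dir, os.path.basename(src_file))
    os.rename(src_file, target)           # succeeds because dest_dir exists again
    return sorted(os.listdir(dest_dir))


def demo() -> list:
    with tempfile.TemporaryDirectory() as root:
        dest = os.path.join(root, "warehouse", "t")
        os.makedirs(dest)
        # Pre-existing data file that the overwrite should remove.
        with open(os.path.join(dest, "old.txt"), "w") as f:
            f.write("old rows\n")
        # New data file with a different name.
        src = os.path.join(root, "new.txt")
        with open(src, "w") as f:
            f.write("new rows\n")
        return overwrite_dir_with_file(src, dest)


print(demo())  # ['new.txt']
```

In this local model the rename only works because the directory is recreated first; dropping
the os.makedirs call makes os.rename raise FileNotFoundError.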


Szehon Ho
