Subject: Re: Review Request 16938: HIVE-6209 'LOAD DATA INPATH ... OVERWRITE ..' doesn't overwrite current data
From: "Szehon Ho"
To: "Szehon Ho", "Prasad Mujumdar", "Mohammad Islam", "hive"
Date: Wed, 22 Jan 2014 03:36:06 -0000

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16938/
-----------------------------------------------------------

(Updated Jan. 22, 2014, 3:36 a.m.)


Review request for hive.


Changes
-------

Added a test as per the review comments. This test failed before the patch, with the second count(*) returning 1000; it now returns the correct value, 500.


Bugs: HIVE-6209
    https://issues.apache.org/jira/browse/HIVE-6209


Repository: hive-git


Description
-------

A wrong condition introduced in HIVE-3756 prevented load data overwrite from working properly. In these situations, destf == oldPath == /user/warehouse/hive/, so -rmr was skipped on the old data.

Note that if the file name was the same, i.e. load data inpath '' with the same path repeatedly, it would work, as the rename would overwrite the old data file. But in this case, the file name is different.

Other minor changes try to improve logging in this area to better diagnose issues (for example file permissions, etc.).

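The behaviour described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code in Hive.java: the class `OverwriteSketch`, its methods, the `skipWhenSamePath` flag and the file names are hypothetical, the table directory is modelled as an in-memory map from file name to row count, and each loaded file is assumed to hold 500 rows (consistent with the 1000 vs. 500 counts mentioned under Changes).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of a table directory (file name -> row count). Not Hive code. */
final class OverwriteSketch {

    /**
     * Loads one file with overwrite semantics.
     * skipWhenSamePath = true imitates the faulty condition described above:
     * the removal of old data is skipped when destf equals oldPath.
     */
    static void loadOverwrite(Map<String, Integer> tableDir, String destf, String oldPath,
                              String srcName, int srcRows, boolean skipWhenSamePath) {
        boolean samePath = destf.equals(oldPath);
        if (!(skipWhenSamePath && samePath)) {
            tableDir.clear();           // stands in for the -rmr on old data
        }
        tableDir.put(srcName, srcRows); // stands in for the rename; a same-named file is replaced
    }

    /** Analogue of count(*): sums the rows of every file left in the directory. */
    static int countRows(Map<String, Integer> tableDir) {
        int total = 0;
        for (int rows : tableDir.values()) {
            total += rows;
        }
        return total;
    }

    /** Runs two overwrite loads of 500 rows each (assumed size) and returns the final count. */
    static int runScenario(boolean skipWhenSamePath, String firstFile, String secondFile) {
        Map<String, Integer> tableDir = new LinkedHashMap<>();
        String path = "/user/warehouse/hive/"; // destf == oldPath, as in the description
        loadOverwrite(tableDir, path, path, firstFile, 500, skipWhenSamePath);
        loadOverwrite(tableDir, path, path, secondFile, 500, skipWhenSamePath);
        return countRows(tableDir);
    }

    public static void main(String[] args) {
        // Hypothetical file names, chosen only for illustration.
        System.out.println(runScenario(true, "first.txt", "second.txt"));  // 1000: old file survives
        System.out.println(runScenario(false, "first.txt", "second.txt")); // 500: old data removed
        System.out.println(runScenario(true, "first.txt", "first.txt"));   // 500: same name is replaced
    }
}
```

The third call shows why the problem was easy to miss: when the same file name is loaded repeatedly, the rename alone replaces the old file, so the skipped removal has no visible effect.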
Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 2fe86e1
  ql/src/test/queries/clientpositive/load_fs_overwrite.q PRE-CREATION
  ql/src/test/results/clientpositive/load_fs_overwrite.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/16938/diff/


Testing
-------

The primary concern was whether removing the directory in these scenarios would make the rename fail. It should not, because of the earlier fs.mkdirs call, but I still verified the following scenarios:

load/insert overwrite into table with partitions
load/insert overwrite into table with buckets


Thanks,

Szehon Ho
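The concern under Testing (does the rename still succeed once the destination directory has been removed?) can be sketched on a local filesystem. This is an analogy only: it uses java.nio.file rather than the Hadoop FileSystem API, the class `RenameAfterDeleteSketch` and the file names are hypothetical, and the delete, mkdirs, rename order is an assumption about the sequence described above rather than a copy of Hive.java.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Local-filesystem analogy of "remove old data, mkdirs, rename". Not Hive code. */
final class RenameAfterDeleteSketch {

    /** Recursively deletes dir if it exists (stand-in for -rmr). */
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order so that children are deleted before their parents.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    /** Replaces the contents of dest with srcFile and returns the file names left in dest. */
    static List<String> overwriteInto(Path dest, Path srcFile) {
        try {
            deleteRecursively(dest);       // old data is gone, including dest itself
            Files.createDirectories(dest); // analogue of the fs.mkdirs call
            Files.move(srcFile, dest.resolve(srcFile.getFileName())); // analogue of the rename
            try (Stream<Path> listing = Files.list(dest)) {
                return listing.map(p -> p.getFileName().toString())
                              .sorted()
                              .collect(Collectors.toList());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a temporary table directory holding old.txt, then overwrites it with new.txt. */
    static List<String> demo() {
        try {
            Path root = Files.createTempDirectory("overwrite-sketch");
            Path dest = Files.createDirectories(root.resolve("table"));
            Files.write(dest.resolve("old.txt"), "old".getBytes(StandardCharsets.UTF_8));
            Path src = root.resolve("new.txt");
            Files.write(src, "new".getBytes(StandardCharsets.UTF_8));
            List<String> result = overwriteInto(dest, src);
            deleteRecursively(root);       // clean up the temporary directory
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [new.txt]
    }
}
```

Without the `Files.createDirectories` step, the move would fail in this local analogy because the target's parent directory no longer exists, which mirrors the reasoning given for why the mkdirs call keeps the rename safe.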