Subject: Re: Review Request 65716: HIVE-18696: The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs
From: Marta Kuczora via Review Board
To: Peter Vary, Adam Szita, Alexander Kolbasov
Cc: hive, Sahil Takiar, Marta Kuczora
Date: Thu, 08 Mar 2018 16:52:54 -0000

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65716/
-----------------------------------------------------------

(Updated March 8, 2018, 4:52 p.m.)

Review request for hive, Alexander Kolbasov, Peter Vary, and Adam Szita.

Changes
-------

Fixed review findings.

Bugs: HIVE-18696
    https://issues.apache.org/jira/browse/HIVE-18696

Repository: hive-git

Description
-------

The idea behind the patch is

1) Separate the partition validation from starting the tasks which create the partition folders. Instead of checking the partitions and submitting the tasks in one loop, the validation is now done in a separate loop.
So first iterate through the partitions, validate the table/db names, and check for duplicates. Then, if all partitions are valid, submit the tasks that create the partition folders in a second loop. This way, if one of the partitions is invalid, the exception is thrown in the first loop, before any task is submitted. So we can be sure that no partition folder will be created if the list contains an invalid partition.

2) Handle the exceptions which occur during the execution of the tasks differently. Previously, if an exception occurred in one task, the remaining tasks were canceled, and the newly created partition folders were cleaned up in the finally part. The problem was that some tasks could still be in the middle of creating their folders while the others were being cleaned up, so folders could be left over. Testing showed that this case cannot be avoided completely when canceling the tasks.

The idea of this patch is to set a flag if an exception is thrown in one of the tasks. This flag is visible in the tasks, and if its value is true, the partition folders won't be created. Then iterate through the remaining tasks and wait for them to finish. The tasks which started before the flag was set will finish creating their partition folders. The tasks which start after the flag was set won't create the partition folders, to avoid unnecessary work. This guarantees that all tasks have finished when entering the finally part, where the partition folders are cleaned up.
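The two steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the class name AddPartitionsSketch, the method signature, the in-memory set `fs` standing in for the file system, and the `failing` set used to simulate folder-creation errors are all hypothetical. It also assumes the flag is set by the thread that waits on the tasks, which may differ from where the patch sets it.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch of "validate first, then create folders, then wait for all tasks". */
public final class AddPartitionsSketch {
    private AddPartitionsSketch() {}

    /**
     * @param names   partition names to add
     * @param fs      thread-safe set standing in for the file system (assumption)
     * @param failing names whose folder creation should fail (test hook, assumption)
     */
    public static void addPartitions(List<String> names, Set<String> fs, Set<String> failing)
            throws Exception {
        // Loop 1: validate everything before a single task is submitted.
        Set<String> seen = new HashSet<>();
        for (String name : names) {
            if (name == null || name.isEmpty()) {
                throw new IllegalArgumentException("Invalid partition name");
            }
            if (!seen.add(name)) {
                throw new IllegalArgumentException("Duplicate partition: " + name);
            }
        }

        AtomicBoolean failureOccurred = new AtomicBoolean(false);
        Set<String> created = ConcurrentHashMap.newKeySet(); // folders created by this call
        ExecutorService pool = Executors.newFixedThreadPool(4);
        boolean success = false;
        try {
            // Loop 2: submit one folder-creation task per partition.
            List<Future<?>> futures = new ArrayList<>();
            for (String name : names) {
                futures.add(pool.submit(() -> {
                    if (failureOccurred.get()) {
                        return null; // a failure was already seen: skip unnecessary work
                    }
                    if (failing.contains(name)) {
                        throw new IllegalStateException("Cannot create folder for " + name);
                    }
                    if (fs.add(name)) {
                        created.add(name); // track only folders that are new
                    }
                    return null;
                }));
            }

            // Wait for every task instead of canceling; remember the first failure.
            Exception first = null;
            for (Future<?> future : futures) {
                try {
                    future.get();
                } catch (ExecutionException e) {
                    failureOccurred.set(true);
                    if (first == null) {
                        first = e;
                    }
                }
            }
            if (first != null) {
                throw first;
            }
            success = true;
        } finally {
            pool.shutdown();
            if (!success) {
                // All tasks have finished here, so no folder can appear after the cleanup.
                for (String name : created) {
                    fs.remove(name);
                }
            }
        }
    }
}
```

Because every Future is waited on before the finally block runs, the `created` set is complete when the cleanup starts, which is the property the canceling approach could not guarantee.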

Diffs (updated)
-----

  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 662de9a
  standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java 4d9cb1b
  standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java 1122057

Diff: https://reviews.apache.org/r/65716/diff/3/

Changes: https://reviews.apache.org/r/65716/diff/2-3/

Testing
-------

Added some new test cases to the TestAddPartitions and TestAddPartitionsFromPartSpec tests.

Thanks,

Marta Kuczora