hive-dev mailing list archives

From Marta Kuczora via Review Board <>
Subject Re: Review Request 65716: HIVE-18696: The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs
Date Thu, 08 Mar 2018 16:52:54 GMT

This is an automatically generated e-mail. To reply, visit:

(Updated March 8, 2018, 4:52 p.m.)

Review request for hive, Alexander Kolbasov, Peter Vary, and Adam Szita.


Fixed review findings.

Bugs: HIVE-18696

Repository: hive-git


The idea behind the patch is twofold:

1) Separate the partition validation from starting the tasks which create the partition folders.

Instead of checking the partitions and submitting the tasks in a single loop, the validation
is now done in a separate loop. The first loop iterates through the partitions, validates the
table/db names, and checks for duplicates. If all partitions are valid, the second loop
submits the tasks that create the partition folders. This way, if one of the partitions is
invalid, the exception is thrown in the first loop, before any task is submitted, so no
partition folder is created if the list contains an invalid partition.
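
To illustrate the two-loop structure, here is a minimal sketch in plain Java. It is not the
code from the patch: the Part class, the addPartitions signature and the createdFolders list
(which stands in for submitting the folder-creation tasks) are hypothetical names used only
for this example.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ValidateThenSubmit {
    // Hypothetical stand-in for a partition: db name, table name and partition values.
    static final class Part {
        final String db;
        final String table;
        final String values;

        Part(String db, String table, String values) {
            this.db = db;
            this.table = table;
            this.values = values;
        }
    }

    // Validates every partition first; folder names are recorded only if all are valid.
    static void addPartitions(String db, String table, List<Part> parts,
                              List<String> createdFolders) {
        Set<String> seen = new HashSet<>();

        // Loop 1: validation only. Any exception is thrown before work is submitted.
        for (Part p : parts) {
            if (!db.equals(p.db) || !table.equals(p.table)) {
                throw new IllegalArgumentException(
                    "Partition does not belong to " + db + "." + table);
            }
            if (!seen.add(p.values)) {
                throw new IllegalArgumentException("Duplicate partition: " + p.values);
            }
        }

        // Loop 2: reached only when every partition passed validation.
        // Adding to the list stands in for submitting a folder-creation task.
        for (Part p : parts) {
            createdFolders.add(db + "/" + table + "/" + p.values);
        }
    }
}
```

In this sketch, passing a list that contains a duplicate or a partition of another table
throws from the first loop and leaves createdFolders empty.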

2) Handle the exceptions which occur during the execution of the tasks differently.

Previously, if an exception occurred in one task, the remaining tasks were canceled and the
newly created partition folders were cleaned up in the finally block. The problem was that
some tasks could still be in the middle of creating their folders while the others were
being cleaned up, so folders could be left over. Testing showed that this cannot be avoided
completely when canceling the tasks.
With this patch, a flag is set if an exception is thrown in one of the tasks. The flag is
visible to all tasks, and if its value is true, a task does not create its partition folder.
The code then iterates through the remaining tasks and waits for them to finish. Tasks that
started before the flag was set finish creating their partition folders; tasks that start
after the flag was set skip the folder creation to avoid unnecessary work. This guarantees
that all tasks have finished by the time the finally block, where the partition folders are
cleaned up, is entered.
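
The flag-and-wait approach could look roughly like the sketch below. It uses only
java.util.concurrent and is not the HiveMetaStore code: the createAll method, the
failureOccurred flag, the failing parameter (which simulates a task that throws) and the
in-memory set of "created" folders are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;

class FailFlagSketch {
    // Returns the folders that remain after the call; empty if any task failed.
    static Set<String> createAll(List<String> folders, String failing)
            throws InterruptedException {
        Set<String> created = ConcurrentHashMap.newKeySet();
        AtomicBoolean failureOccurred = new AtomicBoolean(false);
        ExecutorService pool = Executors.newFixedThreadPool(4);
        boolean success = false;
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (String folder : folders) {
                futures.add(pool.submit(() -> {
                    // Tasks that start after a failure skip the folder creation.
                    if (failureOccurred.get()) {
                        return;
                    }
                    if (folder.equals(failing)) {
                        // Simulated failure: set the flag, then propagate.
                        failureOccurred.set(true);
                        throw new IllegalStateException("Cannot create " + folder);
                    }
                    created.add(folder); // stands in for creating the folder
                }));
            }

            // Wait for every task instead of canceling the remaining ones.
            boolean failed = false;
            for (Future<?> f : futures) {
                try {
                    f.get();
                } catch (ExecutionException e) {
                    failed = true;
                }
            }
            success = !failed;
        } finally {
            pool.shutdown();
            // All tasks have finished here, so the cleanup cannot race with them.
            if (!success) {
                created.clear(); // stands in for deleting the created folders
            }
        }
        return created;
    }
}
```

Because every Future is waited on before the finally block runs, no task can add a folder
after the cleanup, which is the property the description above relies on.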

Diffs (updated)

  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ 662de9a





Added some new test cases to the TestAddPartitions and TestAddPartitionsFromPartSpec tests.


Marta Kuczora
