impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Attila Jeges (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-1670: Support multiple partitions in ALTER TABLE ADD PARTITION
Date Thu, 15 Sep 2016 18:50:26 GMT
Attila Jeges has posted comments on this change.

Change subject: IMPALA-1670: Support multiple partitions in ALTER TABLE ADD PARTITION

Patch Set 8:

File fe/src/main/java/com/cloudera/impala/analysis/

PS8, Line 20: import java.util.List;
            : import java.util.Set;
> nit: let's try to keep the import sorted if possible (move them below L 33)

PS8, Line 59: if (getDb() != null) {
            :       sb.append(getDb() + ".");
            :     }
> single line
File fe/src/main/java/com/cloudera/impala/analysis/

PS8, Line 3: import java.util.List;
           : import java.util.Set;
           : import java.util.ArrayList;
           : import java.util.Collections;
           : import java.util.Comparator;
> nit: unordered imports

PS8, Line 189: represantation
> typo: representation
File fe/src/main/java/com/cloudera/impala/service/

PS8, Line 375: // If 'IfNotExists' is false and there is a conflict with the partitions that
             :           // already exist in Hive, then:
             :           // - leave existing partitions intact
             :           // - don't add any new partitions to the table
             :           // - throw ImpalaRuntimeException.
             :           // If 'IfNotExists' is true, conflicts are hadled as follows:
             :           // - If some (but not all) partitions already exist in Hive, then
             :           // the partitions that already exist and add the rest.
             :           // - If all of the partitions already exist, ignore them and return
> I believe this comment section is almost duplicate to the one in alterTable

PS8, Line 1642: alterTableCachePartitions
> can you swap this function with alterTableAddPartitions? (top down reading 

PS8, Line 1686: hmsPartitions
> Will this try to alter all the partitions even though only one has a cache 
Changed the code to call alter_partitions() only with those partitions that should be cached.

I've added an E2E test (test_add_overlapping_partitions()) to to
test the scenario you described:

After ALTER TABLE Impala knows only about the newly added partitions. Partitions that have
already existed in HMS are silently ignored, they are not added to catalog cache and their
caching directives are ignored as well.

I thought that this is the expected behavior. The previous implementation of ALTER TABLE ADD
PARTITION behaved in a similar fashion. Another alter statement (ALTER TABLE name PARTITION(partition_spec)
SET) is used to change caching for an existing partition.

PS8, Line 1696: Hive
> nit: replace Hive with HMS

PS8, Line 1718: hmsPartitionValues2CachingOp
> An alternative for hmsPartitionValues2CachingOp would be partitionCachingOp

PS8, Line 1718: ImmutableList
> Why using ImmutableLists here? These variables are never exposed outside th
I guess, I wanted to communicate very directly and explicitly to the reader that the lists
won't change once they are in the map.

Got rid of ImmutableList.

PS8, Line 1720:  
> nit: extra space

Line 1727:       } else {
> You may add a "continue;" at the end of the if block and skip the "else"

PS8, Line 1733: ImmutableList.copyOf(hmsPartition.getValues()
> Partitions have unique ids, you don't need to use strings constructed from 
A unique id would work, but I couldn't find one in the documentation of org.apache.hadoop.hive.metastore.api.Partition

Maybe I am missing something?

PS8, Line 1745: catch (AlreadyExistsException e) {
              :         throw new ImpalaRuntimeException(
              :             String.format(HMS_RPC_ERROR_FORMAT_STR, "add_partitions"), e);
              :       } catch (TException e) {
              :         throw new ImpalaRuntimeException(
              :             String.format(HMS_RPC_ERROR_FORMAT_STR, "add_partitions"), e);
              :       }
> You don't really treat these two exceptions differently. Since Java 7, you 
According to the documentation add_partitions thows the following exceptions:

The first three are all inherited from TException.

Changed the code to catch TException only.

PS8, Line 1766: for (Partition partition: hmsPartitions) {
              :         // Create and add the HdfsPartition. Return the table object with
an updated
              :         // catalog version.
              :         addHdfsPartition(tbl, partition);
              :       }
> I think there is a scenario that may result in an inconsistent state. Say y
Correct, this is exactly what happens. I've added an E2E test (test_add_overlapping_partitions)
to to verify this.

Do you think this behavior is acceptable or we should do something about it?

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges <>
Gerrit-Reviewer: Attila Jeges <>
Gerrit-Reviewer: Bharath Vissapragada <>
Gerrit-Reviewer: Dimitris Tsirogiannis <>
Gerrit-Reviewer: Lars Volker <>
Gerrit-Reviewer: Michael Ho <>
Gerrit-HasComments: Yes

View raw message