impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Attila Jeges (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-1670: Support multiple partitions in ALTER TABLE ADD PARTITION
Date Thu, 22 Sep 2016 16:11:53 GMT
Attila Jeges has posted comments on this change.

Change subject: IMPALA-1670: Support multiple partitions in ALTER TABLE ADD PARTITION

Patch Set 10:

File fe/src/main/java/com/cloudera/impala/analysis/

PS10, Line 75: params.setLocation
> Why are you setting the location if it's null? isn't this an optional field

PS10, Line 90: table_ instanceof HdfsTable)
> Is PartitionParams class applicable to other table types? If not, set a Pre
After going through the code once again, I realized that during the analysis, the Table object
is available through the 'analyzer' object, therefore PartitionParams.setTable() is not needed
at all.

The partitionSpec_.analyze() call ensures that the target table is an HdfsTable instance,
so it is safe to remove the checks from PartitionParams.analyze().
File fe/src/main/java/com/cloudera/impala/analysis/

PS10, Line 210: new ArrayList<PartitionKeyValue>(partitionSpec_);
> = Lists.newArrayList(partitionSpec_)?
File fe/src/main/java/com/cloudera/impala/service/

PS10, Line 1640: for partitions not yet in catalog cache, add cache directives
> I don't think the comment is accurate. We shouldn't by default add cache di
Correct, the comment was fixed.

PS10, Line 1644: TAlterTableAddPartitionParams addPartParams)
               :       throws ImpalaException {
> nit: can these two be in the same line?

PS10, Line 1651: hmsPartitions
> Maybe rename to "hmsPartitionsToAdd"?

PS10, Line 1653: new HashMap<List<String>, THdfsCachingOp>();
> Maps.newHashMap()?

PS10, Line 1657: partitionSpec)) {
> formatting nit: 2 more spaces are needed since this is part of a statement

PS10, Line 1673: partitionCachingOpMap.put(hmsPartition.getValues(), cacheOp);
> Why do you populate this map with all the new partitions? Not all the parti
Correct, if cacheOp is null it doesn't have to be added to the map.

I'm not sure I understood the 2nd part of your comment correctly, please clarify it for me
if I didn't:
If I create a list of Partitions that have a non-null cacheOp and pass that instead of 'hmsPartitions'
to alterTableCachePartitions(), then I still need to map partitions to their corresponding
cacheOps somehow. Otherwise I couldn't tell what cache pool name/replication to use when caching
a partition. Also note that if the table is cached, its partitions should be cached as well
when their cacheOp is null, so it is not safe to exclude them.

Here's what I did to simplify the code and get rid off partitionCachingOpMap:
Create a list of cacheOps for all the new partitions and pass that instead of 'partitionCachingOpMap'
to 'alterTableCachePartitions()' along with the list of new partitions. The list of cacheOps
and the list of new partitions are of equal size and their elements can be paired up to get
the partitions with their corresponding cacheOps.

PS10, Line 1689: hmsPartitions = getPartitionsFromHMS(msClient, tableName, hmsPartitions,
               :         addedHmsPartitions);
> Do we always need to call this function? What if the size of addedHmsPartit
Correct, no need to make the call in that case.

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges <>
Gerrit-Reviewer: Attila Jeges <>
Gerrit-Reviewer: Bharath Vissapragada <>
Gerrit-Reviewer: Dimitris Tsirogiannis <>
Gerrit-Reviewer: Lars Volker <>
Gerrit-Reviewer: Michael Ho <>
Gerrit-HasComments: Yes

View raw message