impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bikramjeet Vig (Code Review)" <>
Subject [Impala-CR](cdh5-trunk) IMPALA-1683: Allow REFRESH on a single partition
Date Fri, 08 Jul 2016 22:07:37 GMT
Bikramjeet Vig has posted comments on this change.

Change subject: IMPALA-1683: Allow REFRESH on a single partition

Patch Set 5:

File fe/src/main/java/com/cloudera/impala/analysis/

PS5, Line 42: Preconditions.checkArgument(!isRefresh || name != null);
            :     Preconditions.checkArgument(partitionSpec == null || name != null);
> Can't we combine these two into a single check?
File fe/src/main/java/com/cloudera/impala/catalog/

PS5, Line 1164: To achieve that, it drops the partition
              :    * from the table object, fetches a fresh copy of the partition from Hive
              :    * metastore and adds it back to the table.
> We no longer do that in this function, so you may want to update the commen

PS5, Line 1174: Db db = tbl.getDb();
> This seems to be used first time in L1195. Let's move it closer to that.
These 3 variables can be moved to L1195 but I kept them here because I wanted to minimize
amount of code inside critical sections(while holding locks)

PS5, Line 1175: HdfsPartition refreshedPartition = null;
> Where is this variable used?

PS5, Line 1182: // For the case where partition does not exist in Catalog cache but might
              :       // exist in Hive Metastore
> I am not sure I follow the comment. I believe it's more accurate to say tha

PS5, Line 1185: getPartitionNameFromThriftPartitionSpec(
> I think it's a bit more clear to call this function constructPartitionName(

PS5, Line 1187: %s",
              :           tbl.getFullName() + " " + partitionName));
> "%s %s", tbl.getFullName(), partitionName);

PS5, Line 1189: long newCatalogVersion = incrementAndGetCatalogVersion();
              :       catalogLock_.writeLock().unlock();
> Move this above L1180. We don't want to hold the catalog lock longer than n

Line 1193:         org.apache.hadoop.hive.metastore.api.Partition partition = null;
> I would add a comment here saying: "If the partition does not exist in Hive

PS5, Line 1206:     + tblName.getTable_name() + " " + partitionName,
              :               e);
> nit: merge these two lines
File fe/src/main/java/com/cloudera/impala/catalog/

PS5, Line 1944: data
> metadata (files and blocks)

PS5, Line 1948: dropPartition(partitionSpec);
> We have a dropPartition function that takes as input a Partition object, so
the dropPartition function takes in a HdfsPartition type object but a "org.apache.hadoop.hive.metastore.api.Partition"
type object is being passed to this method. The only option was to use the dropPartition method
that took in the partitionSpec object. If I do not pass the partition spec object, I ll have
to write a helper method to create that object from the Partition object.

Also, the dropPartition method takes in the HdfsPartition that is already in catalog to remove
the stats and block data. If I pass it the new HdfsPartition object that I created, that will
mess up all the metadata.

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I4bf5dd630f2e175be5800187bbf9337c91f82bd8
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Bikramjeet Vig <>
Gerrit-Reviewer: Bikramjeet Vig <>
Gerrit-Reviewer: David Knupp <>
Gerrit-Reviewer: Dimitris Tsirogiannis <>
Gerrit-HasComments: Yes

View raw message