impala-reviews mailing list archives

From "Philip Zeyliger (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4524: Batch calls to ALTER TABLE...ADD PARTITION.
Date Mon, 16 Oct 2017 18:43:01 GMT
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8238 )

Change subject: IMPALA-4524: Batch calls to ALTER TABLE...ADD PARTITION.
......................................................................


Patch Set 4:

(2 comments)

Thanks for the review!

I ran the core tests and added a Python test, as you suggested.

http://gerrit.cloudera.org:8080/#/c/8238/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/8238/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@a1954
PS3, Line 1954: 
> Why did you remove this?
Mistake. Re-instated.

(I missed that MetaStoreClient was AutoCloseable...)
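
For readers following along: the point about AutoCloseable is presumably that the client is meant to be opened in a try-with-resources statement, which calls close() even when the body throws. The snippet below is only a minimal sketch of that Java idiom. FakeClient and its methods are hypothetical stand-ins, not the Impala MetaStoreClient.

```java
import java.util.ArrayList;
import java.util.List;

final class AutoCloseSketch {
  /** Hypothetical stand-in for a pooled metastore client; not the Impala class. */
  static final class FakeClient implements AutoCloseable {
    private final List<String> log;

    FakeClient(List<String> log) {
      this.log = log;
      log.add("open");
    }

    void addPartitions(int n) {
      log.add("add " + n);
    }

    @Override
    public void close() {
      // In a real client this would release the connection back to its pool.
      log.add("close");
    }
  }

  /** Normal path: close() runs automatically when the try block exits. */
  static List<String> run() {
    List<String> log = new ArrayList<>();
    try (FakeClient client = new FakeClient(log)) {
      client.addPartitions(3);
    }
    return log;
  }

  /** Failure path: close() still runs, and it runs before the catch block. */
  static List<String> runWithFailure() {
    List<String> log = new ArrayList<>();
    try (FakeClient client = new FakeClient(log)) {
      throw new IllegalStateException("simulated RPC failure");
    } catch (IllegalStateException e) {
      log.add("caught");
    }
    return log;
  }
}
```

Dropping the try-with-resources wrapper would compile fine but leave the client unreleased on both paths, which is why removing it is easy to do by mistake.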


http://gerrit.cloudera.org:8080/#/c/8238/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1974
PS3, Line 1974: // HMS because some of them may already exist there. In that case, we load in the
              :       // catalog the partitions that already exist in HMS but aren't in the catalog yet.
              :       if (allHmsPartitionsToAdd.size() != addedHmsPartitions.size()) {
              :         List<Partition> difference = computeDifference(allHmsPartitionsToAdd,
              :             addedHmsPartitions);
              :         addedHmsPartitions.addAll(
              :             getPartitionsFromHms(msTbl, msClient, tableName, difference));
              :       }
              : 
              :       for (Partition partition: addedHmsPartitions) {
              :         // Create and add the HdfsPartition to catalog. Return the table object with an
              :        
> I think you can refactor this so that you make only one call to getPartitio
Sure. I tightened the partitioned loop to only include the msClient call.

Note that this implies that getPartitionsFromHms() is OK to call with a big (> MAX_PARTITION_UPDATES_PER_RPC) batch. Is that against the spirit here?
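
To make the shape of that refactor concrete, here is a rough sketch, not the actual patch: the loop issues one add call per batch of at most MAX_PER_RPC partitions, and the difference is computed once afterwards, so the follow-up fetch would receive a single list that may exceed the batch limit. PartitionAdder, addInBatches, difference and the value 3 are all hypothetical. The real constant and client API are not shown in this thread.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchingSketch {
  // Hypothetical value; the real MAX_PARTITION_UPDATES_PER_RPC is not given in the thread.
  static final int MAX_PER_RPC = 3;

  /** Hypothetical stand-in for the metastore client's add call. */
  interface PartitionAdder<T> {
    /** Returns the subset of the batch that was actually added. */
    List<T> add(List<T> batch);
  }

  /** Issues one client call per batch; the loop body contains nothing else. */
  static <T> List<T> addInBatches(List<T> toAdd, PartitionAdder<T> client) {
    List<T> added = new ArrayList<>();
    for (int start = 0; start < toAdd.size(); start += MAX_PER_RPC) {
      int end = Math.min(start + MAX_PER_RPC, toAdd.size());
      added.addAll(client.add(toAdd.subList(start, end)));
    }
    return added;
  }

  /** Computed once, after the loop: requested items that the client did not add. */
  static <T> List<T> difference(List<T> all, List<T> added) {
    List<T> missing = new ArrayList<>(all);
    missing.removeAll(added);
    return missing;
  }
}
```

With this structure, the list returned by difference() is unbounded in size, which is exactly the open question above: whether the follow-up fetch should also be split into batches.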



-- 
To view, visit http://gerrit.cloudera.org:8080/8238
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I95f8221ff08c0f126f951f7d37ff5e57985f855f
Gerrit-Change-Number: 8238
Gerrit-PatchSet: 4
Gerrit-Owner: Philip Zeyliger <philip@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <philip@cloudera.com>
Gerrit-Comment-Date: Mon, 16 Oct 2017 18:43:01 +0000
Gerrit-HasComments: Yes
