impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dimitris Tsirogiannis (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4762: RECOVER PARTITIONS should batch partition updates
Date Tue, 07 Mar 2017 00:09:32 GMT
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-4762: RECOVER PARTITIONS should batch partition updates
......................................................................


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/6275/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

PS1, Line 2641: // ifNotExists and needResults are true.
nit: move comment above L2644


PS1, Line 2646: } catch (AlreadyExistsException e) {
              :             throw new InternalException(
              :                 "AlreadyExistsException thrown although ifNotExists given",
e);
              :         }
I don't think we need to handle this case separately. Let the outer catch handle it. Besides,
we're not doing anything special here.


http://gerrit.cloudera.org:8080/#/c/6275/1/tests/metadata/test_recover_partitions.py
File tests/metadata/test_recover_partitions.py:

PS1, Line 171: test_recover_many_partitions
This test may be too expensive. How much time does it take to run? See my comments below for
ideas on how to trim it a bit.


PS1, Line 182: # Preload 10% of the partitions with partition(i) = values(i)
             :     for i in xrange(1, 700, 10):
             :         self.execute_query_expect_success(self.client,
             :             "INSERT INTO TABLE %s PARTITION(s='part%d') VALUES(%d)" % (FQ_TBL_NAME,
i, i))
I think we can live without this part.


PS1, Line 193: self.filesystem_client.create_file(TBL_LOCATION + PART_DIR + FILE_PATH,
             :                                            INSERTED_VALUE)
It should be sufficient to just create the directory, no?


PS1, Line 196: result = self.execute_query_expect_success(self.client,
             :         "SHOW PARTITIONS %s" % FQ_TBL_NAME)
             :     for i in xrange(1, 700):
             :         PART_DIR = "part%d\t" % i
             :         if (i % 10 == 1):
             :             assert self.has_value(PART_DIR, result.data)
             :         else:
             :             assert not self.has_value(PART_DIR, result.data)
             : 
             :     self.execute_query_expect_success(self.client,
             :         "ALTER TABLE %s RECOVER PARTITIONS" % FQ_TBL_NAME)
             :     result = self.execute_query_expect_success(self.client,
             :         "select c from %s" % FQ_TBL_NAME)
             :     for i in xrange(1, 700):
             :         if (i % 10 == 1):
             :             assert self.has_value(str(i), result.data)
             :         else:
             :             assert self.has_value(str(-i), result.data)
I would just do: a) show partitions and verify 0, b) recover partitions, c) show partitions
and verify 700.


-- 
To view, visit http://gerrit.cloudera.org:8080/6275
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7f9334051b11ba8fa16159b7ca67ddc7f2392733
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden <zamsden@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message