impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alex Behm (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5259: Add REFRESH FUNCTIONS <db> statement
Date Fri, 12 May 2017 23:08:48 GMT
Alex Behm has posted comments on this change.

Change subject: IMPALA-5259: Add REFRESH FUNCTIONS <db> statement
......................................................................


Patch Set 1:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/6878/1/fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
File fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java:

Line 30:  * Representation of a REFRESH/INVALIDATE METADATA statement.
Can you spell out the different variants to make this clearer?


Line 45:   public ResetMetadataStmt(TableName name, boolean isRefresh,
This c'tor is starting to look scary. Please make it private and add static create() functions
for the different use cases, e.g.:

createInvalidateStmt();
createRefreshStmt();
createRefreshFunctionsStmt();


http://gerrit.cloudera.org:8080/#/c/6878/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

Line 616:       List<org.apache.hadoop.hive.metastore.api.Function> javaFns =
Let's move as much of the heavy work as possible outside of the global lock, in particular,
those parts that talk to HMS.

We only need to acquire the catalogLock_ when once we access/update the DB in our cache.


Line 624:       Db db = dbCache_.get().get(dbName);
need to handle db == null


Line 627:       Db tmpDb = new Db(dbName, this, msDb);
I understand this is convenient, but it would be preferable to not RPC to HMS to get the msDb.
That's a rather high price which adds extra potential failure weirdness. Can't you just construct
a new msDb from the info we have in db?


Line 636:       for (String fn_name: tmpDb.getAllFunctions().keySet()) {
easier to read if you iterate over the entrySet()


Line 638:           boolean result = db.addFunction(fn);
We need to handle the !result case. We could have a totally different function with the same
name, and we need to update that function in db.

The easiest way is probably to first db.removeFunction() and then re-add if something was
removed.


Line 648:       for (String fn_name: db.getAllFunctions().keySet()) {
easier to read if you iterate over the entrySet()


Line 653:             toRemove.add(fn);
why not remove immediately here?


http://gerrit.cloudera.org:8080/#/c/6878/1/fe/src/main/java/org/apache/impala/catalog/Function.java
File fe/src/main/java/org/apache/impala/catalog/Function.java:

Line 318:     result.setCatalog_version(getCatalogVersion());
using catalogVersion_ is a little clearer to me


http://gerrit.cloudera.org:8080/#/c/6878/1/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

Line 225:     if (req.is_delta) {
single line and remove the else block


http://gerrit.cloudera.org:8080/#/c/6878/1/tests/custom_cluster/test_permanent_udfs.py
File tests/custom_cluster/test_permanent_udfs.py:

Line 220:     REFRESH_COMMANDS = ["INVALIDATE METADATA",
Let's think a little about how to test these scenarios:
1. Hive dropped + re-added a different function with the same name. See that REFRESH returns
the new function.
2. Test that REFRESH works as expected for native functions.


-- 
To view, visit http://gerrit.cloudera.org:8080/6878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message