impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alex Behm (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5058: Improve the concurrency of DDL/DML operations
Date Wed, 06 Dec 2017 00:28:49 GMT
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8752 )

Change subject: IMPALA-5058: Improve the concurrency of DDL/DML operations
......................................................................


Patch Set 1:

(25 comments)

Here's a first wave of comments. Still thinking about some places in more detail.

http://gerrit.cloudera.org:8080/#/c/8752/1/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/8752/1/be/src/service/impala-server.h@328
PS1, Line 328:   /// waits all other catalog topic subscribers to process this update by checking
the
waits for


http://gerrit.cloudera.org:8080/#/c/8752/1/be/src/service/impala-server.h@339
PS1, Line 339:   void WaitForMinCatalogUpdate(const int64_t min_catalog_update_version,
Add comment


http://gerrit.cloudera.org:8080/#/c/8752/1/be/src/service/impala-server.h@344
PS1, Line 344:   void WaitForCatalogUpdateTopicPropagation(const TUniqueId& catalog_service_id);
Let's pass in the version to keep these Wait() APIs consistent.


http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/CatalogInternalService.thrift
File common/thrift/CatalogInternalService.thrift:

http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/CatalogInternalService.thrift@39
PS1, Line 39:   // List of updated (new and modified) catalog objects for which the catalog
verion is
catalog objects whose version is larger than ...


http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/CatalogInternalService.thrift@43
PS1, Line 43:   // List of deleted catalog objects for which the catalog version is larger
than
catalog objects whose version is larger than ...


http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/StatestoreService.thrift
File common/thrift/StatestoreService.thrift:

http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/StatestoreService.thrift@83
PS1, Line 83:   // subscribers need additional information in order to process the deleted
topics that
This is needed when subscribers require additional information not captured in the item key
to process the deleted item (e.g., catalog version of deleted catalog object).


http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/StatestoreService.thrift@88
PS1, Line 88:   // non-delta TTopicDelta's (since the latest version of every still-present
topic will
* Remove parenthesis since the explanation is important
* Do you mean "still-present item" instead of "still-present topic"?

The sentence is somewhat confusing to me, maybe I'm misunderstanding. A non-delta update must
include all items so it should also include this one, right?


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java
File fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java:

http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java@91
PS1, Line 91:       for (RolePrivilege priv: existingRole.getPrivileges()) {
might simplify some places to have bulk versions of the add/remove calls like addAll and removeAll
that take a collection of CatalogObjects


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java
File fe/src/main/java/org/apache/impala/catalog/Catalog.java:

http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@544
PS1, Line 544:       throws IllegalStateException {
remove throws since it's a RuntimeException


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@577
PS1, Line 577:    * TODO: Use global object IDs everywhere instead of tracking catalog objects
by
Seems like a pretty big TODO unlikely to get addressed. I suggest removing it.


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@580
PS1, Line 580:   public static boolean objectNamesMatch(TCatalogObject first, TCatalogObject
second) {
keyEquals() or similar since we are actually checking for key equality and not "object name"
equality


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java:

http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java@37
PS1, Line 37:  *   direct ("fast") update that applies the result of a catalog operation to
the cache
remove ("fast")


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java@53
PS1, Line 53:  *   The catalogd uses this log to identify deleted catalog objects. Deleted
... to identify objects that have been deleted since the last catalog topic update.


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java@55
PS1, Line 55:  *   them (e.g. dropTable()). While constructing a catalog update topic, we
use the log to
Sentence in the middle with "While" can be removed imo. The first sentence already gives this
info.


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java@57
PS1, Line 57:  *   Once the catalog topic update is constructed, the old deleted catalog objects
are
the old deleted catalog objects -> the old entries in this log


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogObjectImpl.java
File fe/src/main/java/org/apache/impala/catalog/CatalogObjectImpl.java:

http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogObjectImpl.java@28
PS1, Line 28:   private AtomicLong catalogVersion_ = new AtomicLong(Catalog.INITIAL_CATALOG_VERSION);
Not clear why this has to be atomic, but let's think about that outside of this patch. Commenting
here so we don't forget.


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java
File fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java:

http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java@39
PS1, Line 39:   public void updateCatalogObjectVersions(long oldVersion, long newVersion)
{
Method names seem verbose, how about just add/remove/update


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java@53
PS1, Line 53:   public long getMinimumObjectVersion() {
getMinimumVersion()


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@95
PS1, Line 95:  * CatalogServiceCatalog guarantees that very catalog object that has a version
in the
typo: that every

Sentence does not seem accurate. I think we guarantee that all objects included in the update
are within that version range - but we don't necessarily include *all* such objects. Some
objects are skipped to allow the topic update thread to make progress.

We should also mention that the topic update itself is assigned a version (the "to" version).


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@97
PS1, Line 97:  * does not prevent DDL requests from making progress while the catalog topic
update is
Let's phrase this positively: Concurrent DDL requests are allowed while a topic update is
in progress.


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@98
PS1, Line 98:  * created. Hence, there is a non-zero probability that frequently modified
catalog
Concept of skipping has not been introduced, need to discuss that above when talking about
the nuances of version range, etc.


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@103
PS1, Line 103:  * higher than the 'to' version of the topic update.
As a result, the same version of an object might be sent in two subsequent topic updates.


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@127
PS1, Line 127:  *   The time-based cleanup process of the topic update log entries may cause
metadata
Give this section a title like "Known Anomalies with SYNC_DDL"


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@138
PS1, Line 138:  *   waitFroSyncDdlVersion() for more details.
typo: waitForSyncDdlVersion()


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@224
PS1, Line 224:   private class TopicUpdateLog {
Can you move this into a separate .java file. This one is already getting huge.



-- 
To view, visit http://gerrit.cloudera.org:8080/8752
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If12467a83acaeca6a127491d89291dedba91a35a
Gerrit-Change-Number: 8752
Gerrit-PatchSet: 1
Gerrit-Owner: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Comment-Date: Wed, 06 Dec 2017 00:28:49 +0000
Gerrit-HasComments: Yes

Mime
  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message