impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dimitris Tsirogiannis (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-5538: Use explicit catalog versions for deleted objects
Date Wed, 20 Sep 2017 05:15:27 GMT
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-5538: Use explicit catalog versions for deleted objects

Patch Set 5:

File be/src/catalog/

PS5, Line 231:   // If this is not a delta update, request an update from version 0 from the
             :   // catalog. There is an optimization that checks if pending_topic_updates_
was just
             :   // reloaded from version 0. If so, then skip this step and use that data.
> I found it impossible to understand this comment without reading through mo

PS5, Line 234: delta.from_version == 0 && delta.to_version == 0
> from the comment above, presumably this condition means "this is not a delt
Yeah, you're right. This condition is really confusing. from_version = 0 implies delta = false
and can be used here instead. The part that is not clear to me is the to_version check but
I couldn't find any case where it would make sense to check the to_version in order to decide
whether to send the full catalog update or not. I will remove it and see if it breaks something.

PS5, Line 310: if (catalog_object.catalog_version <= last_sent_catalog_version_) continue;
> given that last_sent_catalog_version_ was passed to GetAllCatalogObjects(),
Right, it's not needed. Replaced it with a DCHECK.
File be/src/catalog/catalog-server.h:

PS5, Line 150: added
> nit: May be 'added/updated/removed' to be consistent with other places?
File be/src/service/

PS5, Line 1392: 
> can we delete this function now?
Yes. Done

PS5, Line 1366: ;
> Could you add 'len' too, given this is trace logging anyway. (might be help

PS5, Line 1399: // Nothing to do here.
> Remove? Kinda seems obvious
File be/src/statestore/

PS5, Line 177:     entry_it->second.SetDeleted(true);
             :     entry_it->second.SetVersion(last_version_);
> now that some subscribes require deleted entries to have a value, how does 
The behavior is topic specific and currently only subscribers of catalog-update topic rely
on the value for deleted entries. Not sure how to make it less fragile in the sense that "value_"
is opaque to the statestore and is interpreted only by subscribers.

PS5, Line 490:  
> would be easier to read if you line break it there to keep the last arg all

PS5, Line 538: if (topic_entry.value() != Statestore::TopicEntry::NULL_VALUE)
> why do we need that guard? why can't we set topic_item.value even in that c
Yeap, removed it.

PS5, Line 546:         int64_t topic_size = topic.total_key_size_bytes() - deleted_key_size_bytes
             :             + topic.total_value_size_bytes();
> that math doesn't look correct anymore (deleted entries can have non-zero v
File be/src/statestore/statestore.h:

PS5, Line 180: may
> may or will?
"will". Changed it.
File common/thrift/CatalogInternalService.thrift:

PS5, Line 25: GetAllCatalogObjects
> see below. I think we should rename this.

PS5, Line 25: Contains all known Catalog objects
            : // (databases, tables/views, and functions) from the CatalogService's cache.
> I think that should be updated to explain that this is relative to a partic

PS5, Line 38: empty list if no objects detected in the Catalog
> Thanks. As we talked about in person, I think we should rename GetAllCatalo
File common/thrift/StatestoreService.thrift:

PS5, Line 87: If true, this item was deleted
> How about:

PS5, Line 87: If true, this item was deleted
> Actually, the parenthesis I suggested doesn't clarify.  Maybe that part sho

PS5, Line 97: List of changes to topic entries, including deletions.
> That's only true when is_delta==true.  So, how about we say:

PS5, Line 100: applied to in-memory state,
> that's kinda dependent on the subscriber's implementation, right? would it 
File fe/src/main/java/org/apache/impala/catalog/

PS5, Line 53: identiy
> nit: typo

PS5, Line 123: if (first.getType() != second.getType()) return false;
> This seems to already be handled by the next line (by generating a differen
Good point. Removed
File fe/src/main/java/org/apache/impala/catalog/

PS5, Line 318: TGetAllCatalogObjectsResponse resp = new TGetAllCatalogObjectsResponse();
             :     resp.setUpdated_objects(new ArrayList<TCatalogObject>());
             :     resp.setDeleted_objects(new ArrayList<TCatalogObject>());
             :     resp.setMax_catalog_version(Catalog.INITIAL_CATALOG_VERSION);
> Move all of this into getCatalogChanges() and make it return resp?

Line 350:    * to 'resp'.
> May be worth adding that this method expects that the global read lock is h

Line 369:           if (tbl.getCatalogVersion() > fromVersion) {
> Given we hold the global readLock() already (which means that the version c
It's racy if you do that. A concurrent table modification may have gotten a new version but
not assigned it yet to the table. If you read tbl version without holding the lock you don't
know if the table operation has finished or not hence, you don't know if it ok to reject this
object or not.

PS5, Line 374: trace
> This seems like a serious enough error to me, basically blocking updates on

Line 970:     }
> I see you made changes in the CatalogOpEx.drop*() to add the corresponding 
I thought about it but there are two reasons not to do it: a) we need the TCatalogObject which
is not created in the remove* functions and b) it's not only the remove* functions that record
the deleted objects (see execResetMetadata).
File fe/src/main/java/org/apache/impala/catalog/

Line 322:    *  Note that drop operations that come from statestore heartbeats always have
> Update?

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I93cb7a033dc8f0d3e0339394b36affe14523274c
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Bharath Vissapragada <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Dimitris Tsirogiannis <>
Gerrit-Reviewer: Vuk Ercegovac <>
Gerrit-HasComments: Yes

View raw message