impala-dev mailing list archives

From "Huaisi Xu (Code Review)" <ger...@cloudera.org>
Subject [Impala-CR](cdh5-trunk) IMPALA-3499: Batch update catalog cache update.
Date Wed, 01 Jun 2016 19:56:14 GMT
Huaisi Xu has posted comments on this change.

Change subject: IMPALA-3499: Batch update catalog cache update.
......................................................................


Patch Set 5:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/3067/5/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

Line 1270:         LOG(ERROR) << "Error deserializing item: " << status.GetDetail();
> don't you need a continue after this?
This is not inside a for loop. It only fetches the catalog object that carries
catalog_service_id (default 0). We need it in a split update; otherwise the frontend detects
a catalog service ID change and requests a full update, which will eventually fail.


Line 1278: TUpdateCatalogCacheResponse resp;
> Similarly, consider moving this inside loop on line 1280, and redeclaring t
ok


Line 1281: len = item.value.size();
> My point is that there's no relationship between the len used in line 1266 
ok


Line 1299: update_size + len
> Ah, I understand the flow of the code a bit better now. I think a comment t
Do you mean that we should take the lock no matter how large the update is? If the update is
small (usually the case), I don't think we need this lock. Doesn't the frontend synchronize that for us?
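
To make the question concrete, here is a minimal sketch of "take the lock only when the update is large enough to be split". It is not the code in the patch: kMaxSingleUpdateBytes, catalog_update_mutex and ApplyUpdate are hypothetical names, and std::mutex stands in for whatever lock type impala-server.cc actually uses.

```cpp
#include <cstddef>
#include <mutex>

// Hypothetical threshold above which an update would be split into batches.
constexpr std::size_t kMaxSingleUpdateBytes = 1024;
// Hypothetical mutex standing in for the catalog update lock.
std::mutex catalog_update_mutex;

// Returns true if the lock was held while applying this update.
bool ApplyUpdate(std::size_t update_size) {
  // Construct the lock object without acquiring the mutex.
  std::unique_lock<std::mutex> l(catalog_update_mutex, std::defer_lock);
  // Only a large (split) update takes the lock; a small one skips it.
  if (update_size > kMaxSingleUpdateBytes) l.lock();
  const bool locked = l.owns_lock();
  // ... send the update (one batch, or several if split) ...
  return locked;
  // The unique_lock destructor releases the mutex here if it was acquired.
}
```

In this sketch a small update never touches the mutex, and a large one holds it until the function returns.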


Line 1354: if (update_status.ok()) {
         :       update_status = exec_env_->frontend()->UpdateCatalogCache(update_req, &resp);
         :     }
> That's subtle, and hard for the reader to understand. Please consider just 
I am not sure how, because at line 1354 the code does not know whether a split update
happened. Either way, we do not re-acquire locks here. I am not sure what you mean by "just
holding the lock for the duration of this method."


Line 1357: if (catalog_update_lock.owns_lock()) catalog_update_lock.unlock();
> Has it been faster in any of your tests?
I have not tested this; I just thought it would be more explicit. Do you think I should remove it?


http://gerrit.cloudera.org:8080/#/c/3067/5/fe/src/main/java/com/cloudera/impala/catalog/ImpaladCatalog.java
File fe/src/main/java/com/cloudera/impala/catalog/ImpaladCatalog.java:

Line 114: last_batch_update
> not clear from the name what condition this represents.
update_with_CATALOG_TCatalogObjectType?


Line 129: if (catalogObject.getType() == TCatalogObjectType.CATALOG) {
        :         newCatalogVersion = catalogObject.getCatalog_version();
        :         last_batch_update = true;
        :       }
> Not clear how this works - will one of the updates *always* contain a CATAL
There is only one catalog object (it has the service ID) at the end of every statestore update.
If we split the update, the intermediate updates won't have this object (we still have the service ID, though).
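
A rough sketch of the splitting behavior described above, written in C++ for illustration rather than taken from the patch. Batch, SplitUpdate and max_batch_bytes are hypothetical names, items are modeled as plain strings, and the sketch assumes the CATALOG object is the last item of the statestore update.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One piece of a split update (hypothetical type, not from the patch).
struct Batch {
  std::vector<std::string> items;
  bool has_catalog_object = false;  // set only on the final batch
};

// Splits serialized items into batches of at most max_batch_bytes each
// (an oversized single item still gets a batch of its own).
// Assumption: the last element of `items` is the CATALOG object.
std::vector<Batch> SplitUpdate(const std::vector<std::string>& items,
                               std::size_t max_batch_bytes) {
  std::vector<Batch> batches;
  if (items.empty()) return batches;
  batches.emplace_back();
  std::size_t batch_bytes = 0;
  for (const std::string& item : items) {
    // Start a new batch when adding this item would exceed the limit.
    if (!batches.back().items.empty() &&
        batch_bytes + item.size() > max_batch_bytes) {
      batches.emplace_back();
      batch_bytes = 0;
    }
    batches.back().items.push_back(item);
    batch_bytes += item.size();
  }
  // Only the batch holding the trailing CATALOG object is the last one.
  batches.back().has_catalog_object = true;
  return batches;
}
```

With this shape, a receiver can treat has_catalog_object as the "this is the last batch" signal, which is the condition last_batch_update is meant to capture.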


-- 
To view, visit http://gerrit.cloudera.org:8080/3067
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I176db25124a32944f2396ce8aafbed49cac95928
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Huaisi Xu <hxu@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-Reviewer: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Huaisi Xu <hxu@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes
