Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 377222009E2 for ; Wed, 1 Jun 2016 21:56:21 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 35F85160A4D; Wed, 1 Jun 2016 19:56:21 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 7D86D160A45 for ; Wed, 1 Jun 2016 21:56:20 +0200 (CEST) Received: (qmail 89563 invoked by uid 500); 1 Jun 2016 19:56:19 -0000 Mailing-List: contact dev-help@impala.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@impala.incubator.apache.org Delivered-To: mailing list dev@impala.incubator.apache.org Received: (qmail 89552 invoked by uid 99); 1 Jun 2016 19:56:19 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd3-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 01 Jun 2016 19:56:19 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id 012981804DE for ; Wed, 1 Jun 2016 19:56:19 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 0.362 X-Spam-Level: X-Spam-Status: No, score=0.362 tagged_above=-999 required=6.31 tests=[RDNS_DYNAMIC=0.363, SPF_PASS=-0.001] autolearn=disabled Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024) with ESMTP id syi__rQ6g0_y for ; Wed, 1 Jun 2016 19:56:16 +0000 (UTC) Received: from ip-10-146-233-104.ec2.internal (ec2-75-101-130-251.compute-1.amazonaws.com [75.101.130.251]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id 2240F5F24D for ; Wed, 1 Jun 2016 19:56:16 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by ip-10-146-233-104.ec2.internal (8.14.4/8.14.4) with ESMTP id u51JuEd7012928; Wed, 1 Jun 2016 19:56:14 GMT Message-Id: <201606011956.u51JuEd7012928@ip-10-146-233-104.ec2.internal> Date: Wed, 1 Jun 2016 19:56:14 +0000 From: "Huaisi Xu (Code Review)" To: impala-cr@cloudera.com, dev@impala.incubator.apache.org CC: Henry Robinson , Dimitris Tsirogiannis , Tim Armstrong , Dan Hecht Reply-To: hxu@cloudera.com X-Gerrit-MessageType: comment Subject: =?UTF-8?Q?[Impala-CR](cdh5-trunk)_IMPALA-3499:_Batch_update_catalog_cache_update.=0A?= X-Gerrit-Change-Id: I176db25124a32944f2396ce8aafbed49cac95928 X-Gerrit-ChangeURL: X-Gerrit-Commit: babaa40fe549d6736c4286e0abf5acffec36272a In-Reply-To: References: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Content-Disposition: inline User-Agent: Gerrit/2.10-rc0 archived-at: Wed, 01 Jun 2016 19:56:21 -0000 Huaisi Xu has posted comments on this change. Change subject: IMPALA-3499: Batch update catalog cache update. ...................................................................... Patch Set 5: (8 comments) http://gerrit.cloudera.org:8080/#/c/3067/5/be/src/service/impala-server.cc File be/src/service/impala-server.cc: Line 1270: LOG(ERROR) << "Error deserializing item: " << status.GetDetail(); > don't you need a continue after this? This is not in for loop. This is to get the catalog object with catalog_service_id(default 0) only.. we need this in split update otherwise frontend detects catalog service id changes and requests a full update. it will fail eventually. Line 1278: TUpdateCatalogCacheResponse resp; > Similarly, consider moving this inside loop on line 1280, and redeclaring t ok Line 1281: len = item.value.size(); > My point is that there's no relationship between the len used in line 1266 ok Line 1299: update_size + len > Ah, I understand the flow of the code a bit better now. I think a comment t Do you mean that we take the lock no matter how large is the update? if update is small(usually the case), we do not need this lock I think. Front end synchronize that for us? Line 1354: if (update_status.ok()) { : update_status = exec_env_->frontend()->UpdateCatalogCache(update_req, &resp); : } > That's subtle, and hard for the reader to understand. Please consider just I am not sure how... because when at line 1354, the code does not know if any split update happens. either yes or no, we do not re-acquire locks here. not sure what do you mean by "just holding the lock for the duration of this method." Line 1357: if (catalog_update_lock.owns_lock()) catalog_update_lock.unlock(); > Has it been faster in any of your tests? not tried to test this... just thought to be more explicit. you think I'd better remove this? http://gerrit.cloudera.org:8080/#/c/3067/5/fe/src/main/java/com/cloudera/impala/catalog/ImpaladCatalog.java File fe/src/main/java/com/cloudera/impala/catalog/ImpaladCatalog.java: Line 114: last_batch_update > not clear from the name what condition this represents. update_with_CATALOG_TCatalogObjectType? Line 129: if (catalogObject.getType() == TCatalogObjectType.CATALOG) { : newCatalogVersion = catalogObject.getCatalog_version(); : last_batch_update = true; : } > Not clear how this works - will one of the updates *always* contain a CATAL there is only one catalog object(has service id) at the end for every statestore update. if we split that, intermediate update wont have this object.(we still have the service id though) -- To view, visit http://gerrit.cloudera.org:8080/3067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I176db25124a32944f2396ce8aafbed49cac95928 Gerrit-PatchSet: 5 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Huaisi Xu Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes