Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 5D13F200D5B for ; Wed, 13 Dec 2017 23:34:58 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id 5BB2B160C23; Wed, 13 Dec 2017 22:34:58 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 77F12160C0F for ; Wed, 13 Dec 2017 23:34:57 +0100 (CET) Received: (qmail 62654 invoked by uid 500); 13 Dec 2017 22:34:56 -0000 Mailing-List: contact reviews-help@impala.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list reviews@impala.apache.org Received: (qmail 62641 invoked by uid 99); 13 Dec 2017 22:34:56 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd3-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 13 Dec 2017 22:34:56 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id EC2F2180895 for ; Wed, 13 Dec 2017 22:34:55 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 2.362 X-Spam-Level: ** X-Spam-Status: No, score=2.362 tagged_above=-999 required=6.31 tests=[HTML_MESSAGE=2, RDNS_DYNAMIC=0.363, SPF_PASS=-0.001] autolearn=disabled Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024) with ESMTP id HZsRpFzd2rMX for ; Wed, 13 Dec 2017 22:34:52 +0000 (UTC) Received: from ip-10-146-233-104.ec2.internal (ec2-75-101-130-251.compute-1.amazonaws.com [75.101.130.251]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id 13C895F19D for ; Wed, 13 Dec 2017 22:34:51 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by ip-10-146-233-104.ec2.internal (8.14.4/8.14.4) with ESMTP id vBDMYolu028996; Wed, 13 Dec 2017 22:34:50 GMT Message-Id: <201712132234.vBDMYolu028996@ip-10-146-233-104.ec2.internal> X-Gerrit-PatchSet: 3 Date: Wed, 13 Dec 2017 22:34:50 +0000 From: "Bharath Vissapragada (Code Review)" To: Dimitris Tsirogiannis , impala-cr@cloudera.com, reviews@impala.incubator.apache.org CC: Philip Zeyliger , Vuk Ercegovac X-Gerrit-MessageType: comment Subject: =?UTF-8?Q?=5BImpala-ASF-CR=5D_=5BPREVIEW=5D_IMPALA-4886=3A_Expose_table_metrics_in_the_catalog_web_UI=2E=0A?= X-Gerrit-Change-Id: I37d407979e6d3b1a444b6b6265900b148facde9e X-Gerrit-Change-Number: 8529 X-Gerrit-ChangeURL: X-Gerrit-Commit: 98d18cfcec8db0eabcaa3810808160dd6d5ad49a In-Reply-To: References: X-Gerrit-Comment-Date: Wed, 13 Dec 2017 22:34:50 +0000 Reply-To: bharathv@cloudera.com, impala-cr@cloudera.com, marcelk@gmail.com, philip@cloudera.com, reviews@impala.incubator.apache.org, dtsirogiannis@cloudera.com, vercegovac@cloudera.com MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Disposition: inline User-Agent: Gerrit/2.14.2 Content-Type: multipart/alternative; boundary="hqaifSPj6Zw="; charset=UTF-8 archived-at: Wed, 13 Dec 2017 22:34:58 -0000 --hqaifSPj6Zw= Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Bharath Vissapragada has posted comments on this change=2E ( http://gerrit= =2Ecloudera=2Eorg:8080/8529 ) Change subject: [PREVIEW] IMPALA-4886: Expos= e table metrics in the catalog web UI=2E =2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E= =2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E= =2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E=2E= =2E=2E=2E=2E=2E=2E=2E=2E=2E Patch Set 3: (20 comments) The patch mostly= lgtm=2E I have some minor suggestions before I can +1=2E http://gerrit=2E= cloudera=2Eorg:8080/#/c/8529/3//COMMIT_MSG Commit Message: http://gerrit= =2Ecloudera=2Eorg:8080/#/c/8529/3//COMMIT_MSG@7 PS3, Line 7: [PREVIEW] Remo= ve? http://gerrit=2Ecloudera=2Eorg:8080/#/c/8529/2/be/src/catalog/catalog= -server=2Ecc File be/src/catalog/catalog-server=2Ecc: http://gerrit=2Eclou= dera=2Eorg:8080/#/c/8529/2/be/src/catalog/catalog-server=2Ecc@359 PS2, Line= 359: GetCatalogUsage(document); > In principal, that's a good idea=2E I = plan on doing it in the future for more Makes sense=2E If you don't mind, c= an you please raise jiras for these to-dos so that we don't lose track of t= hem=2E Also, they sound like great "ramp-up" tasks, that others can pick up= =2E http://gerrit=2Ecloudera=2Eorg:8080/#/c/8529/2/be/src/catalog/catalog= -server=2Ecc@492 PS2, Line 492: // TODO: Enable json view of table metric= s > Good idea=2E Left a TODO=2E I think thats a one-liner looking at other = places in the code=2E document->AddMember(Webserver::ENABLE_RAW_JSON_KEY, = true, document->GetAllocator()); http://gerrit=2Ecloudera=2Eorg:8080/#/c/= 8529/3/be/src/catalog/catalog-server=2Ecc File be/src/catalog/catalog-serve= r=2Ecc: http://gerrit=2Ecloudera=2Eorg:8080/#/c/8529/3/be/src/catalog/cata= log-server=2Ecc@359 PS3, Line 359: GetCatalogUsage(document); Check the r= eturn value and return early if it threw a failure? if (!GetCatalogUsage()= =2Eok()) return; http://gerrit=2Ecloudera=2Eorg:8080/#/c/8529/3/be/src/ca= talog/catalog-server=2Ecc@516 PS3, Line 516: object_name name? http://ger= rit=2Ecloudera=2Eorg:8080/#/c/8529/2/be/src/catalog/catalog=2Eh File be/src= /catalog/catalog=2Eh: http://gerrit=2Ecloudera=2Eorg:8080/#/c/8529/2/be/sr= c/catalog/catalog=2Eh@96 PS2, Line 96: Status GetCatalogUsage(TGetCatalog= UsageResponse* response); > I renamed the other one to GetCatalogUsage=2E I= thought of keeping it a bit m Agree=2E http://gerrit=2Ecloudera=2Eorg:80= 80/#/c/8529/3/fe/src/main/java/org/apache/impala/catalog/Catalog=2Ejava Fil= e fe/src/main/java/org/apache/impala/catalog/Catalog=2Ejava: http://gerrit= =2Ecloudera=2Eorg:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/catalo= g/Catalog=2Ejava@166 PS3, Line 166: if (tbl !=3D null) CatalogUsageMoni= tor=2EINSTANCE=2EremoveTable(tbl); I think this is only executed on the Cat= alog service side and it looks to me like the CatalogUsageMonitor is popula= ted on the coordinators too=2E May be we should just restrict it to the cat= alogd? http://gerrit=2Ecloudera=2Eorg:8080/#/c/8529/2/fe/src/main/java/or= g/apache/impala/catalog/CatalogUsageMonitor=2Ejava File fe/src/main/java/or= g/apache/impala/catalog/CatalogUsageMonitor=2Ejava: http://gerrit=2Ecloude= ra=2Eorg:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/catalog/Catalog= UsageMonitor=2Ejava@47 PS2, Line 47: true > I wanted to avoid the scenario = where 25 tables that were accessed in the pa Oh, I see your point=2E It can= be argued both ways=2E http://gerrit=2Ecloudera=2Eorg:8080/#/c/8529/3/fe= /src/main/java/org/apache/impala/catalog/HdfsPartition=2Ejava File fe/src/m= ain/java/org/apache/impala/catalog/HdfsPartition=2Ejava: http://gerrit=2Ec= loudera=2Eorg:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/catalog/Hd= fsPartition=2Ejava@934 PS3, Line 934: thriftHdfsPart=2EsetHas_increment= al_stats(hasIncrementalStats()); IMO, we should also add the incremental st= ats size estimate by computing the obj sizes of the params map (if incr sta= ts are present of course)=2E http://gerrit=2Ecloudera=2Eorg:8080/#/c/8529= /3/fe/src/main/java/org/apache/impala/catalog/HdfsTable=2Ejava File fe/src/= main/java/org/apache/impala/catalog/HdfsTable=2Ejava: http://gerrit=2Eclou= dera=2Eorg:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/catalog/HdfsT= able=2Ejava@196 PS3, Line 196: =2E=2Eany of the partitions in=2E=2E ? h= ttp://gerrit=2Ecloudera=2Eorg:8080/#/c/8529/3/fe/src/main/java/org/apache/i= mpala/catalog/HdfsTable=2Ejava@1720 PS3, Line 1720: metadataSizeEstimate ma= y be memUsageEstimate to be inline with CatalogUsage=2E=2Eterminology? ht= tp://gerrit=2Ecloudera=2Eorg:8080/#/c/8529/3/fe/src/main/java/org/apache/im= pala/catalog/HdfsTable=2Ejava@1743 PS3, Line 1743: stats=2EnumBlocks +=3D = tHdfsPartition=2EgetNum_blocks(); : stats=2EnumFile= s +=3D : tHdfsPartition=2EisSetFile_desc() ? tH= dfsPartition=2EgetFile_desc()=2Esize() : 0; : stats= =2EtotalFileBytes +=3D tHdfsPartition=2EgetTotal_file_size_bytes(); Shouldn= 't these be populated irrespective of "includeFileDesc" since they account = for memory usage? http://gerrit=2Ecloudera=2Eorg:8080/#/c/8529/3/fe/src/m= ain/java/org/apache/impala/catalog/Table=2Ejava File fe/src/main/java/org/a= pache/impala/catalog/Table=2Ejava: http://gerrit=2Ecloudera=2Eorg:8080/#/c= /8529/3/fe/src/main/java/org/apache/impala/catalog/Table=2Ejava@151 PS3, Li= ne 151: public Metrics getMetrics() { return metrics_; } Preconditions=2E= checkState(tableLock_=2EisHeldByCurrentThread())? http://gerrit=2Eclouder= a=2Eorg:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/common/Metrics= =2Ejava File fe/src/main/java/org/apache/impala/common/Metrics=2Ejava: htt= p://gerrit=2Ecloudera=2Eorg:8080/#/c/8529/2/fe/src/main/java/org/apache/imp= ala/common/Metrics=2Ejava@143 PS2, Line 143: > Not sure I follow=2E Why l= e? I just want to print the % to indicate percentil I thought % is used for= percentage and %le or %tile for percentile to differentiate=2E I'm not too= strong about this=2E http://gerrit=2Ecloudera=2Eorg:8080/#/c/8529/3/fe/s= rc/main/java/org/apache/impala/service/CatalogOpExecutor=2Ejava File fe/src= /main/java/org/apache/impala/service/CatalogOpExecutor=2Ejava: http://gerr= it=2Ecloudera=2Eorg:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/serv= ice/CatalogOpExecutor=2Ejava@362 PS3, Line 362: final Timer=2EContext c= ontext Include the lock in the context? http://gerrit=2Ecloudera=2Eorg:80= 80/#/c/8529/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor= =2Ejava@3334 PS3, Line 3334: */ Maybe add a comment that this increments= the opsCount, otherwise, some callers might call it repeatedly resulting i= n spurious counts=2E http://gerrit=2Ecloudera=2Eorg:8080/#/c/8529/3/fe/sr= c/main/java/org/apache/impala/util/TopNCache=2Ejava File fe/src/main/java/o= rg/apache/impala/util/TopNCache=2Ejava: http://gerrit=2Ecloudera=2Eorg:808= 0/#/c/8529/3/fe/src/main/java/org/apache/impala/util/TopNCache=2Ejava@56 PS= 3, Line 56: function_ =3D f; Preconditions=2EcheckState(maxCapacity > 0= )? http://gerrit=2Ecloudera=2Eorg:8080/#/c/8529/3/fe/src/test/java/org/ap= ache/impala/util/TestTopNCache=2Ejava File fe/src/test/java/org/apache/impa= la/util/TestTopNCache=2Ejava: http://gerrit=2Ecloudera=2Eorg:8080/#/c/8529= /3/fe/src/test/java/org/apache/impala/util/TestTopNCache=2Ejava@39 PS3, Lin= e 39: TopNCache cache =3D : new TopNCac= he(new Function() { : @= Override : public Long apply(Long element) { re= turn element; } : }, capacity, policy); = : // populate with more values that the specified max capacity = : for (long i =3D 0; i < capacity * 2; i++) cache=2EputOrU= pdate(i); : assertEquals(cache=2ElistEntries()=2Esize()= , capacity); I think the TopNCache creation with a specified evictionpolicy= and insert types (random/sequential) can be factored out into a helper=2E = http://gerrit=2Ecloudera=2Eorg:8080/#/c/8529/3/fe/src/test/java/org/apach= e/impala/util/TestTopNCache=2Ejava@52 PS3, Line 52: Also assert the rankin= g by randomizing the puts? http://gerrit=2Ecloudera=2Eorg:8080/#/c/8529/3= /www/catalog=2Etmpl File www/catalog=2Etmpl: http://gerrit=2Ecloudera=2Eor= g:8080/#/c/8529/3/www/catalog=2Etmpl@135 PS3, Line 135: {{name}}-metrics add some uni= t tests for this? (test_web_pages=2Epy) -- To view, visit http://gerrit= =2Ecloudera=2Eorg:8080/8529 To unsubscribe, visit http://gerrit=2Ecloudera= =2Eorg:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerr= it-MessageType: comment Gerrit-Change-Id: I37d407979e6d3b1a444b6b6265900b14= 8facde9e Gerrit-Change-Number: 8529 Gerrit-PatchSet: 3 Gerrit-Owner: Dimitr= is Tsirogiannis Gerrit-Reviewer: Bharath Vis= sapragada Gerrit-Reviewer: Dimitris Tsirogiannis = Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac Ger= rit-Comment-Date: Wed, 13 Dec 2017 22:34:50 +0000 Gerrit-HasComments: Yes --hqaifSPj6Zw=--