impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bharath Vissapragada (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] [PREVIEW] IMPALA-4886: Expose table metrics in the catalog web UI.
Date Wed, 13 Dec 2017 22:34:50 GMT
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/8529
)

Change subject: [PREVIEW] IMPALA-4886: Expose table metrics in the catalog web UI.
......................................................................


Patch Set 3:

(20 comments)

The patch mostly lgtm. I have some minor suggestions before I can +1.

http://gerrit.cloudera.org:8080/#/c/8529/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8529/3//COMMIT_MSG@7
PS3, Line 7: [PREVIEW]
Remove?


http://gerrit.cloudera.org:8080/#/c/8529/2/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/8529/2/be/src/catalog/catalog-server.cc@359
PS2, Line 359:   GetCatalogUsage(document);
> In principal, that's a good idea. I plan on doing it in the future for more
Makes sense. If you don't mind, can you please raise jiras for these to-dos so that we don't
lose track of them. Also, they sound like great "ramp-up" tasks, that others can pick up.


http://gerrit.cloudera.org:8080/#/c/8529/2/be/src/catalog/catalog-server.cc@492
PS2, Line 492:   // TODO: Enable json view of table metrics
> Good idea. Left a TODO.
I think thats a one-liner looking at other places in the code.

document->AddMember(Webserver::ENABLE_RAW_JSON_KEY, true, document->GetAllocator());


http://gerrit.cloudera.org:8080/#/c/8529/3/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/8529/3/be/src/catalog/catalog-server.cc@359
PS3, Line 359:   GetCatalogUsage(document);
Check the return value and return early if it threw a failure?

if (!GetCatalogUsage().ok()) return;


http://gerrit.cloudera.org:8080/#/c/8529/3/be/src/catalog/catalog-server.cc@516
PS3, Line 516: object_name
name?


http://gerrit.cloudera.org:8080/#/c/8529/2/be/src/catalog/catalog.h
File be/src/catalog/catalog.h:

http://gerrit.cloudera.org:8080/#/c/8529/2/be/src/catalog/catalog.h@96
PS2, Line 96:   Status GetCatalogUsage(TGetCatalogUsageResponse* response);
> I renamed the other one to GetCatalogUsage. I thought of keeping it a bit m
Agree.


http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/catalog/Catalog.java
File fe/src/main/java/org/apache/impala/catalog/Catalog.java:

http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/catalog/Catalog.java@166
PS3, Line 166:     if (tbl != null) CatalogUsageMonitor.INSTANCE.removeTable(tbl);
I think this is only executed on the Catalog service side and it looks to me like the CatalogUsageMonitor
is populated on the coordinators too. May be we should just restrict it to the catalogd?


http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java
File fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java:

http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java@47
PS2, Line 47: true
> I wanted to avoid the scenario where 25 tables that were accessed in the pa
Oh, I see your point. It can be argued both ways.


http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@934
PS3, Line 934:     thriftHdfsPart.setHas_incremental_stats(hasIncrementalStats());
IMO, we should also add the incremental stats size estimate by computing the obj sizes of
the params map (if incr stats are present of course).


http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@196
PS3, Line 196:  
..any of the partitions in.. ?


http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1720
PS3, Line 1720: metadataSizeEstimate
may be memUsageEstimate to be inline with CatalogUsage..terminology?


http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1743
PS3, Line 1743:  stats.numBlocks += tHdfsPartition.getNum_blocks();
              :           stats.numFiles +=
              :               tHdfsPartition.isSetFile_desc() ? tHdfsPartition.getFile_desc().size()
: 0;
              :           stats.totalFileBytes += tHdfsPartition.getTotal_file_size_bytes();
Shouldn't these be populated irrespective of "includeFileDesc" since they account for memory
usage?


http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/catalog/Table.java@151
PS3, Line 151:   public Metrics getMetrics() { return metrics_; }
Preconditions.checkState(tableLock_.isHeldByCurrentThread())?


http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/common/Metrics.java
File fe/src/main/java/org/apache/impala/common/Metrics.java:

http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/common/Metrics.java@143
PS2, Line 143:  
> Not sure I follow. Why le? I just want to print the % to indicate percentil
I thought % is used for percentage and %le or %tile for percentile to differentiate. I'm not
too strong about this.


http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@362
PS3, Line 362:     final Timer.Context context
Include the lock in the context?


http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3334
PS3, Line 3334:    */
Maybe add a comment that this increments the opsCount, otherwise, some callers might call
it repeatedly resulting in spurious counts.


http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/util/TopNCache.java
File fe/src/main/java/org/apache/impala/util/TopNCache.java:

http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/util/TopNCache.java@56
PS3, Line 56:     function_ = f;
Preconditions.checkState(maxCapacity > 0)?


http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/test/java/org/apache/impala/util/TestTopNCache.java
File fe/src/test/java/org/apache/impala/util/TestTopNCache.java:

http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/test/java/org/apache/impala/util/TestTopNCache.java@39
PS3, Line 39: TopNCache<Long, Long> cache =
            :             new TopNCache<Long, Long>(new Function<Long, Long>()
{
            :                 @Override
            :                 public Long apply(Long element) { return element; }
            :             }, capacity, policy);
            :         // populate with more values that the specified max capacity
            :         for (long i = 0; i < capacity * 2; i++) cache.putOrUpdate(i);
            :         assertEquals(cache.listEntries().size(), capacity);
I think the TopNCache creation with a specified evictionpolicy and insert types (random/sequential)
can be factored out into a helper.


http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/test/java/org/apache/impala/util/TestTopNCache.java@52
PS3, Line 52: 
Also assert the ranking by randomizing the puts?


http://gerrit.cloudera.org:8080/#/c/8529/3/www/catalog.tmpl
File www/catalog.tmpl:

http://gerrit.cloudera.org:8080/#/c/8529/3/www/catalog.tmpl@135
PS3, Line 135:           <td><a href="table_metrics?name={{fqtn}}">{{name}}-metrics</a></td>
add some unit tests for this? (test_web_pages.py)



-- 
To view, visit http://gerrit.cloudera.org:8080/8529
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37d407979e6d3b1a444b6b6265900b148facde9e
Gerrit-Change-Number: 8529
Gerrit-PatchSet: 3
Gerrit-Owner: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bharathv@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <philip@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <vercegovac@cloudera.com>
Gerrit-Comment-Date: Wed, 13 Dec 2017 22:34:50 +0000
Gerrit-HasComments: Yes

Mime
  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message