hive-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Work logged] (HIVE-21912) Implement BlacklistingLlapMetricsListener
Date Thu, 04 Jul 2019 09:01:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-21912?focusedWorklogId=272067&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-272067 ]

ASF GitHub Bot logged work on HIVE-21912:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Jul/19 09:00
            Start Date: 04/Jul/19 09:00
    Worklog Time Spent: 10m 
      Work Description: pvary commented on pull request #698: HIVE-21912: Implement DisablingDaemonStatisticsHandler
URL: https://github.com/apache/hive/pull/698#discussion_r300300551
 
 

 ##########
 File path: llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/metrics/LlapMetricsCollector.java
 ##########
 @@ -210,53 +206,30 @@ public LlapMetrics getMetrics(String workerIdentity) {
     return Collections.unmodifiableMap(instanceStatisticsMap);
   }
 
-  /**
-   * Creates a LlapManagementProtocolClientImpl from a given LlapServiceInstance.
-   */
-  public static class LlapManagementProtocolClientImplFactory {
-    private final Configuration conf;
-    private final RetryPolicy retryPolicy;
-    private final SocketFactory socketFactory;
-
-    public LlapManagementProtocolClientImplFactory(Configuration conf, RetryPolicy retryPolicy,
-                                                   SocketFactory socketFactory) {
-      this.conf = conf;
-      this.retryPolicy = retryPolicy;
-      this.socketFactory = socketFactory;
-    }
-
-    private static LlapManagementProtocolClientImplFactory basicInstance(Configuration conf) {
-      return new LlapManagementProtocolClientImplFactory(
-              conf,
-              RetryPolicies.retryUpToMaximumCountWithFixedSleep(5, 3000L, TimeUnit.MILLISECONDS),
-              NetUtils.getDefaultSocketFactory(conf));
-    }
-
-    public LlapManagementProtocolClientImpl create(LlapServiceInstance serviceInstance) {
-      LlapManagementProtocolClientImpl client = new LlapManagementProtocolClientImpl(conf, serviceInstance.getHost(),
-              serviceInstance.getManagementPort(), retryPolicy,
-              socketFactory);
-      return client;
-    }
-  }
-
   /**
    * Stores the metrics retrieved from the llap daemons, along with the retrieval timestamp.
    */
   public static class LlapMetrics {
     private final long timestamp;
-    private final LlapDaemonProtocolProtos.GetDaemonMetricsResponseProto metrics;
+    private final Map<String, Long> metrics;
+
+    @VisibleForTesting
+    LlapMetrics(long timestamp, Map<String, Long> metrics) {
+      this.timestamp = timestamp;
+      this.metrics = metrics;
+    }
 
     public LlapMetrics(LlapDaemonProtocolProtos.GetDaemonMetricsResponseProto metrics) {
       this.timestamp = System.currentTimeMillis();
-      this.metrics = metrics;
+      this.metrics = new HashMap<String, Long>(metrics.getMetricsCount());
+      metrics.getMetricsList().forEach(entry -> this.metrics.put(entry.getKey(), entry.getValue()));
     }
 
     public long getTimestamp() {
       return timestamp;
     }
 
-    public LlapDaemonProtocolProtos.GetDaemonMetricsResponseProto getMetrics() {
+    public Map<String, Long> getMetrics() {
 
 Review comment:
   Done
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 272067)
    Time Spent: 3h 50m  (was: 3h 40m)

> Implement BlacklistingLlapMetricsListener
> -----------------------------------------
>
>                 Key: HIVE-21912
>                 URL: https://issues.apache.org/jira/browse/HIVE-21912
>             Project: Hive
>          Issue Type: Sub-task
>          Components: llap, Tez
>            Reporter: Peter Vary
>            Assignee: Peter Vary
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-21912.patch, HIVE-21912.wip-2.patch, HIVE-21912.wip.patch
>
>          Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> We should implement a DaemonStatisticsHandler which disables a limping node when both of the following hold:
>  * The node's average response time is greater than 150% (configurable) of that of the other nodes
>  * The other nodes have enough empty executors to handle the requests
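
The two conditions in the issue description can be sketched in Java roughly as below. This is only an illustration and not the code from the pull request: the class `LimpingNodeCheck`, the `NodeStats` holder, and the method `shouldDisable` are hypothetical names. It also assumes that "enough empty executors" means the other nodes together have at least as many free executors as the candidate node currently has busy, which the issue does not spell out.

```java
import java.util.Map;

// Hypothetical sketch; none of these names come from the Hive code base.
final class LimpingNodeCheck {

  /** Per-node snapshot: average response time plus executor counts. */
  static final class NodeStats {
    final double avgResponseTimeMs;
    final int emptyExecutors;
    final int totalExecutors;

    NodeStats(double avgResponseTimeMs, int emptyExecutors, int totalExecutors) {
      this.avgResponseTimeMs = avgResponseTimeMs;
      this.emptyExecutors = emptyExecutors;
      this.totalExecutors = totalExecutors;
    }
  }

  /**
   * Returns true when the candidate is slower than threshold times the mean
   * response time of the other nodes AND the other nodes have enough free
   * executors to take over the candidate's busy ones (assumed interpretation).
   */
  static boolean shouldDisable(String candidate, Map<String, NodeStats> stats, double threshold) {
    NodeStats node = stats.get(candidate);
    if (node == null || stats.size() < 2) {
      return false; // unknown node, or no other node to compare against
    }
    double otherTimeSum = 0;
    int otherCount = 0;
    int otherEmpty = 0;
    for (Map.Entry<String, NodeStats> entry : stats.entrySet()) {
      if (entry.getKey().equals(candidate)) {
        continue;
      }
      otherTimeSum += entry.getValue().avgResponseTimeMs;
      otherEmpty += entry.getValue().emptyExecutors;
      otherCount++;
    }
    double otherAverage = otherTimeSum / otherCount;
    boolean slow = node.avgResponseTimeMs > threshold * otherAverage;
    int busyOnCandidate = node.totalExecutors - node.emptyExecutors;
    boolean othersHaveCapacity = otherEmpty >= busyOnCandidate;
    return slow && othersHaveCapacity;
  }
}
```

With a threshold of 1.5 (the 150% default mentioned in the issue), a node is only disabled when both checks pass, so a slow node stays enabled while the rest of the cluster is saturated.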



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
