hive-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Work logged] (HIVE-21823) New metrics to get the average queue length / free executor number for a given time window
Date Tue, 04 Jun 2019 19:22:01 GMT

     [ https://issues.apache.org/jira/browse/HIVE-21823?focusedWorklogId=253966&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253966 ]

ASF GitHub Bot logged work on HIVE-21823:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Jun/19 19:21
            Start Date: 04/Jun/19 19:21
    Worklog Time Spent: 10m 
      Work Description: odraese commented on pull request #660: HIVE-21823: New metrics to get the average queue length / free executor number for a given time window
URL: https://github.com/apache/hive/pull/660#discussion_r290451548
 
 

 ##########
 File path: llap-server/src/java/org/apache/hadoop/hive/llap/metrics/LlapDaemonExecutorMetrics.java
 ##########
 @@ -397,4 +425,106 @@ public JvmMetrics getJvmMetrics() {
   public String getName() {
     return name;
   }
+
+  /**
+   * Generate time aware average for data points.
+   * For example if we have 3s when the queue size is 1, and 1s when the queue size is 2 then the
+   * calculated average should be (3*1+1*2)/4 = 1.25.
+   */
+  @VisibleForTesting
+  static class TimedAverageMetrics {
+    private final int windowDataSize;
+    private final long windowTimeSize;
+    private final Data[] data;
+    private int nextPos = 0;
+
+    /**
+     * Creates and initializes the metrics object.
+     * @param windowDataSize The maximum number of samples stored
+     * @param windowTimeSize The time window used to generate the average in nanoseconds
+     */
+    TimedAverageMetrics(int windowDataSize, long windowTimeSize) {
+      this(windowDataSize, windowTimeSize, System.nanoTime());
 
 Review comment:
   Using the current time as the default timestamp puts every initial entry inside the time
window, so we aggregate just zeros here. Wouldn't it make sense to initialize the timestamps
with zero, so that all entries are marked "outside of the window" initially?
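
   A minimal sketch of the concern, not the code from the pull request: the class
`DefaultTimestampSketch` and its helpers are hypothetical, and the sketch assumes clock
readings that are much larger than the window. It only counts how many pre-filled slots
fall inside the window for each choice of default timestamp.

```java
import java.util.Arrays;

// Hypothetical illustration; names and structure are assumptions, not the PR's code.
class DefaultTimestampSketch {

    /** Builds the initial slot timestamps, all set to the same default. */
    static long[] initialSlots(int windowDataSize, long defaultTime) {
        long[] slots = new long[windowDataSize];
        Arrays.fill(slots, defaultTime);
        return slots;
    }

    /** Counts the slots whose timestamp lies inside [now - windowNanos, now]. */
    static int slotsInsideWindow(long[] slotTimes, long windowNanos, long now) {
        long windowStart = now - windowNanos;
        int inside = 0;
        for (long t : slotTimes) {
            if (t >= windowStart && t <= now) {
                inside++;
            }
        }
        return inside;
    }

    public static void main(String[] args) {
        long second = 1_000_000_000L;
        long window = 10 * second;
        long createdAt = 1_000 * second; // stand-in for System.nanoTime() at construction
        long now = createdAt + second;   // query one second after construction

        // Default = construction time: every empty slot counts as a valid zero sample.
        System.out.println(slotsInsideWindow(initialSlots(5, createdAt), window, now)); // 5
        // Default = 0: every empty slot is older than the window start.
        System.out.println(slotsInsideWindow(initialSlots(5, 0L), window, now));        // 0
    }
}
```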
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 253966)
    Time Spent: 0.5h  (was: 20m)

> New metrics to get the average queue length / free executor number for a given time window
> ------------------------------------------------------------------------------------------
>
>                 Key: HIVE-21823
>                 URL: https://issues.apache.org/jira/browse/HIVE-21823
>             Project: Hive
>          Issue Type: Sub-task
>          Components: llap
>            Reporter: Peter Vary
>            Assignee: Peter Vary
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-21823.patch
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We need to calculate the average queue size / free executor size for a window to have good data for making routing decisions.
> Interesting things to consider:
>  * The time gap between arriving requests can differ, so a simple average is not enough to give correct data
>  * We need to have 2 parameters
>  ** Time window length
>  ** Maximum number of data points, so we will not collect an "infinite" amount of data at high load
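
The first point above can be illustrated with a small sketch. This is an assumption-laden
example, not the attached patch: the class `TimeWeightedAverageSketch`, its `Sample` type
and the `average` method are hypothetical, and each value is assumed to hold until the next
sample arrives (or until the query time).

```java
// Hypothetical illustration of a time-weighted average; not the HIVE-21823 patch.
class TimeWeightedAverageSketch {

    /** One data point: the value observed from timeNanos onward. */
    static final class Sample {
        final long timeNanos;
        final long value;

        Sample(long timeNanos, long value) {
            this.timeNanos = timeNanos;
            this.value = value;
        }
    }

    /**
     * Averages the samples over [now - windowNanos, now], weighting each value by how long
     * it was in effect. Samples must be sorted by time, oldest first.
     */
    static double average(Sample[] samples, long windowNanos, long now) {
        long windowStart = now - windowNanos;
        double weighted = 0.0;
        long covered = 0L;
        for (int i = 0; i < samples.length; i++) {
            // Clip each interval to the window.
            long start = Math.max(samples[i].timeNanos, windowStart);
            long end = (i + 1 < samples.length) ? samples[i + 1].timeNanos : now;
            end = Math.min(end, now);
            if (end <= start) {
                continue; // interval lies entirely outside the window
            }
            weighted += (double) samples[i].value * (end - start);
            covered += end - start;
        }
        return covered == 0 ? 0.0 : weighted / covered;
    }

    public static void main(String[] args) {
        long second = 1_000_000_000L;
        // Queue size 1 for 3s, then queue size 2 for 1s.
        Sample[] samples = { new Sample(0L, 1), new Sample(3 * second, 2) };
        System.out.println(average(samples, 4 * second, 4 * second)); // 1.25
    }
}
```

A plain average of the two values would give 1.5; weighting by duration gives
(3*1+1*2)/4 = 1.25. The maximum number of data points would correspond to the length of the
samples array in this sketch.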



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
