Date: Mon, 30 May 2016 04:32:12 +0000 (UTC)
From: "Jungtaek Lim (JIRA)"
To: issues@ambari.apache.org
Reply-To: dev@ambari.apache.org
Subject: [jira] [Updated] (AMBARI-16946) Storm Metrics Sink has high chance to discard some datapoints

[ https://issues.apache.org/jira/browse/AMBARI-16946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jungtaek Lim updated AMBARI-16946:
----------------------------------
    Attachment: AMBARI-16946.patch

Reattaching patch since it's not the same format other issues have.

> Storm Metrics Sink has high chance to discard some datapoints
> -------------------------------------------------------------
>
>                 Key: AMBARI-16946
>                 URL: https://issues.apache.org/jira/browse/AMBARI-16946
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-metrics
>            Reporter: Jungtaek Lim
>         Attachments: AMBARI-16946.patch
>
>
> There is a mismatch between TimelineMetricsCache and Storm's unit of metrics: TimelineMetricsCache treats "metric name + timestamp" as unique, but Storm does not.
> For example, assume that bolt B has tasks T1 and T2, and that B has registered metric M1. The metrics sink can receive (T1, M1) and (T2, M1) with the same timestamp TS1 (the timestamp in TaskInfo, not the current time), and whichever arrives later is discarded by TimelineMetricsCache.
> To make each Storm metric point unique, we should use "topology name + component name + task id + metric name" as the metric name, so that "metric name + timestamp" is unique.
> There are other issues I would like to address, too.
> - Currently, the hostname is set to the hostname of the machine that runs the metrics sink. Since TaskInfo has the hostname of the machine that runs the task, we should use that instead.
> - The timestamp in TaskInfo is in seconds, but Storm Metrics Sink treats it as milliseconds, which produces wrong timestamps and breaks cache eviction. It should be multiplied by 1000.
> - 'component name' is not unique across the cluster, so it is not suitable as the app id. 'topology name' is unique, so the proper value for the app id is the topology name.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
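
The collision described in the issue, and the proposed naming and timestamp changes, can be illustrated with a minimal sketch. This is not the attached patch and does not use the real Ambari or Storm classes: TaskInfoStub, qualifiedName, toMillis and storedPoints are hypothetical names, the "." separator is an assumption (the issue does not specify one), and a plain HashMap stands in for a cache that treats "metric name + timestamp" as unique.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: class, method, and field names are hypothetical, not Ambari or Storm APIs.
class StormMetricKeySketch {

    /** Stand-in for the per-task information that arrives with each datapoint. */
    static final class TaskInfoStub {
        final String workerHost;   // host that runs the task (not the sink)
        final String componentId;  // e.g. the bolt name
        final int taskId;
        final long timestampSecs;  // seconds, as stated in the issue

        TaskInfoStub(String workerHost, String componentId, int taskId, long timestampSecs) {
            this.workerHost = workerHost;
            this.componentId = componentId;
            this.taskId = taskId;
            this.timestampSecs = timestampSecs;
        }
    }

    /** Builds "topology.component.taskId.metric"; the "." separator is an assumption. */
    static String qualifiedName(String topology, TaskInfoStub task, String metric) {
        return topology + "." + task.componentId + "." + task.taskId + "." + metric;
    }

    /** Converts the task timestamp from seconds to milliseconds. */
    static long toMillis(long timestampSecs) {
        return timestampSecs * 1000L;
    }

    /**
     * Stores one datapoint per task in a map keyed by "name@timestamp" and returns
     * how many survive. putIfAbsent mimics a cache that drops a later point
     * whose key is already present.
     */
    static int storedPoints(String topology, String metric, boolean qualify, TaskInfoStub... tasks) {
        Map<String, Double> cache = new HashMap<>();
        for (TaskInfoStub task : tasks) {
            String name = qualify ? qualifiedName(topology, task, metric) : metric;
            String key = name + "@" + toMillis(task.timestampSecs);
            cache.putIfAbsent(key, 1.0);
        }
        return cache.size();
    }

    public static void main(String[] args) {
        // Two tasks of bolt B report metric M1 with the same task timestamp.
        TaskInfoStub t1 = new TaskInfoStub("worker-host-1", "B", 1, 1464582732L);
        TaskInfoStub t2 = new TaskInfoStub("worker-host-2", "B", 2, 1464582732L);

        System.out.println(storedPoints("topo", "M1", false, t1, t2)); // bare name: keys collide
        System.out.println(storedPoints("topo", "M1", true, t1, t2));  // qualified name: keys differ
        System.out.println(qualifiedName("topo", t1, "M1"));
        System.out.println(toMillis(t1.timestampSecs));
    }
}
```

With the bare metric name, both tasks map to the same key and only one point is stored; with the qualified name, each task keeps its own point. In the same spirit, the hostname would be taken from the task's workerHost and the app id from the topology name, as the remaining bullets of the issue suggest.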