From: Greg Mann
Date: Fri, 6 Jul 2018 10:25:15 -0700
Subject: Re: Normalization of metric keys
To: user
Cc: dev

Thanks for the reply Ben!

Yea I suspect the lack of normalization there was not intentional, and it means that you can no longer reliably split on '/' unless you apply some external controls to user input.
Yep, this is bad :)

One thing we should consider when normalizing metadata embedded in metric keys (like framework name/ID) is that operators will likely want to de-normalize this information in their metrics tooling. For example, ideally something like the 'mesos_exporter' [1] could expose the framework name/ID as tags which could be easily consumed by the cluster's metrics infrastructure. To accommodate de-normalization, any substitutions we perform while normalizing should be:

1. Unique - we should substitute a single, unique string for each disallowed character
2. Verbose - we should substitute strings which are unlikely to appear in user input. (Examples: '#&#', '*@*', '%!%', etc.)

My thought was that since operators already cannot simply "split on slash", we might as well skip normalization and so avoid the above difficulties with de-normalization. There are other ways that the metric keys can be processed (e.g., regex), but this requires that the tooling have prior knowledge of the keys it should expect, which is not ideal.

However, it also seems reasonable to me to normalize in a way that accommodates de-normalization as noted above. *What would folks think about normalizing by substituting '/' with '#%$'?*

Another character that came up as a possible candidate for substitution is the space character, which could very well appear in framework names. I don't think we have a good reason on our side to substitute whitespace, but perhaps its presence in the metric keys would cause issues with external tooling?

Greg

[1] https://github.com/mesosphere/mesos_exporter

On Tue, Jul 3, 2018 at 5:56 PM, Benjamin Mahler wrote:

> I don't think the lack of principal normalization was intentional. Why
> spread that further? Don't we also have some normalization today?
>
> Having slashes show up in components complicates parsing (can no longer
> split on '/'), no? For example, if we were to introduce the ability to
> query a subset of metrics with a simple matcher (e.g.
> /frameworks/*/messages_received), then this would be complicated by the
> presence of slashes in the principal or other user supplied strings.
>
> On Tue, Jul 3, 2018 at 3:17 PM, Greg Mann wrote:
>
>> Hi all!
>> I'm currently working on adding a suite of new per-framework metrics to
>> help schedulers better debug unexpected/unwanted behavior (MESOS-8842).
>> One issue that has
>> come up during this work is how we should handle strings like the framework
>> name or role name in metric keys, since those strings may contain
>> characters like '/' which already have a meaning in our metrics interface.
>> I intend to place the framework name and ID in the keys for the new
>> per-framework metrics, delimited by a sufficiently-unique separator so that
>> operators can decode the name/ID in their metrics tooling. An example
>> per-framework metric key:
>>
>> master/frameworks/###/tasks/task_running
>>
>>
>> I recently realized that we actually already allow the '/' character in
>> metric keys, since we include the framework principal in these keys:
>>
>> frameworks//messages_received
>> frameworks//messages_processed
>>
>> We don't disallow any characters in the principal, so anything could
>> appear in those keys.
>>
>> *Since we don't normalize the principal in the above keys, my proposal is
>> that we do not normalize the framework name at all when constructing the
>> new per-framework metric keys.*
>>
>>
>> Let me know what you think!
>>
>> Cheers,
>> Greg
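
For illustration, the '/' -> '#%$' substitution floated in this thread could be sketched roughly as below. This is only a sketch under assumptions, not Mesos code: the function names are hypothetical, and the key layout is simplified to carry just the framework name in the variable component.

```python
# Substitution string proposed in the thread for the '/' character.
SLASH_TOKEN = "#%$"


def normalize(component: str) -> str:
    """Replace each '/' in a user-supplied string with the token."""
    return component.replace("/", SLASH_TOKEN)


def denormalize(component: str) -> str:
    """Reverse normalize(). Only exact if the original string never
    contained the token itself, hence the 'verbose' criterion."""
    return component.replace(SLASH_TOKEN, "/")


def make_key(framework_name: str) -> str:
    # Hypothetical, simplified key layout used only for this example.
    parts = ["master", "frameworks", normalize(framework_name),
             "tasks", "task_running"]
    return "/".join(parts)


def framework_name_from_key(key: str) -> str:
    # After normalization, splitting on '/' yields a fixed number of
    # parts, so the name can be picked out by position.
    return denormalize(key.split("/")[2])


key = make_key("team-a/batch jobs")
print(key)                           # master/frameworks/team-a#%$batch jobs/tasks/task_running
print(framework_name_from_key(key))  # team-a/batch jobs
```

With a substitution like this, tooling can split on '/' again and still recover the original name, at the cost of ambiguity if a framework name ever contains '#%$' literally. Note that the space in the name passes through untouched, which is the open question about external tooling raised above.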