Date: Thu, 26 Oct 2017 17:38:00 +0000 (UTC)
From: "Aaron Fabbri (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] [Commented] (HADOOP-14973) [s3a] Log StorageStatistics

    [ https://issues.apache.org/jira/browse/HADOOP-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16220862#comment-16220862 ]

Aaron Fabbri commented on HADOOP-14973:
---------------------------------------

I've seen the toString() stuff, but it doesn't give us a way of logging periodic stats without requiring customers to change job code, right?

I'd like a configurable way to get periodic statistics logged while we work on getting this stuff plumbed through the major compute engines. It would be nice to have something to look at when someone wants to know why their job is running slowly, without requiring job changes. Too much spam in the logs is a concern, though.

[~stevel@apache.org] what do you think about having a config knob like {{fs.s3a.statistics.log.seconds}} or {{fs.s3a.statistics.log.on.close}}?

Note also that aggregation by sum no longer works for many of these metrics (obvious but worth mentioning).
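
To make the two proposed knobs concrete, here is a minimal sketch of how they might behave. It is not S3A code and uses no Hadoop classes: the class name {{PeriodicStatisticsLogger}}, the {{Supplier}} that stands in for a StorageStatistics snapshot, and the {{Consumer}} that stands in for a logger are all hypothetical, and the two property names are only the ones floated in this comment, not existing Hadoop options.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Sketch only: logs a statistics snapshot periodically and, optionally, on close(). */
class PeriodicStatisticsLogger implements AutoCloseable {
    // Keys proposed in this thread; assumptions, NOT existing Hadoop configuration options.
    static final String LOG_SECONDS = "fs.s3a.statistics.log.seconds";
    static final String LOG_ON_CLOSE = "fs.s3a.statistics.log.on.close";

    private final Supplier<Map<String, Long>> snapshot; // stand-in for a StorageStatistics view
    private final Consumer<String> sink;                // stand-in for a logger; must be thread-safe
    private final boolean logOnClose;
    private final ScheduledExecutorService scheduler;   // null when periodic logging is off

    PeriodicStatisticsLogger(Properties conf,
                             Supplier<Map<String, Long>> snapshot,
                             Consumer<String> sink) {
        this.snapshot = snapshot;
        this.sink = sink;
        // Both knobs default to "off" so that nothing is logged unless asked for.
        long seconds = Long.parseLong(conf.getProperty(LOG_SECONDS, "0"));
        this.logOnClose = Boolean.parseBoolean(conf.getProperty(LOG_ON_CLOSE, "false"));
        if (seconds > 0) {
            // Daemon thread, so the logger never keeps the host process alive.
            scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "statistics-logger");
                t.setDaemon(true);
                return t;
            });
            scheduler.scheduleAtFixedRate(this::logOnce, seconds, seconds, TimeUnit.SECONDS);
        } else {
            scheduler = null;
        }
    }

    /** Renders one line per snapshot, with keys sorted so successive lines are easy to compare. */
    static String format(Map<String, Long> stats) {
        StringBuilder sb = new StringBuilder("StorageStatistics:");
        for (Map.Entry<String, Long> e : new TreeMap<>(stats).entrySet()) {
            sb.append(' ').append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    void logOnce() {
        sink.accept(format(snapshot.get()));
    }

    @Override
    public void close() {
        if (scheduler != null) {
            scheduler.shutdownNow(); // stop periodic logging
        }
        if (logOnClose) {
            logOnce();               // final snapshot
        }
    }
}
```

With both knobs left unset nothing is logged, which addresses the log-spam concern by default. Emitting a single line per snapshot keeps the volume bounded by the configured interval.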
> [s3a] Log StorageStatistics
> ---------------------------
>
>                 Key: HADOOP-14973
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14973
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.0.0-beta1, 2.8.1
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>
> S3A is currently storing much more detailed metrics via StorageStatistics than are logged in a MapReduce job. Eventually, it would be nice to get Spark, MapReduce and other workloads to retrieve and store these metrics, but it may be some time before they all do that. I'd like to consider having S3A publish the metrics itself in some form. This is tricky, as S3A has no daemon but lives inside various other processes.
> Perhaps writing to a log file at some configurable interval and on close() would be the best we could do. Other ideas would be welcome.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org