Subject: Re: Measuring Samza Job Throughput
From: Chris Riccomini
To: Chris Riccomini
Cc: "dev@samza.apache.org"
Date: Tue, 16 Jun 2015 23:26:24 -0700

Hmm, correction. I think this has to be done at the KafkaSystem level.
We allow consumers and producers to return non-byte messages, which means
nothing in the container can safely assume that a message is a byte array
except the serde manager. I took a look there but didn't see any byte
throughput metrics after all.

On Tuesday, June 16, 2015, Chris Riccomini wrote:

> Hey Milinda,
>
> Specifically, for bytes/sec, you might want to look at serde metrics. I
> believe the serde manager tracks bytes serialized and deserialized per
> second. The consumers and producers also do this for Kafka, but on a more
> granular basis. If you want container-level throughput, serde manager is
> worth looking at.
>
> Cheers,
> Chris
>
> On Tuesday, June 16, 2015, Milinda Pathirage wrote:
>
>> Hi Devs,
>>
>> I was looking for a way to measure Samza job throughput and found that it's
>> possible to do it via Samza's metrics reporter. But there are several types of
>> metrics reported via this method. For example, TaskInstanceMetrics reports
>> number of messages sent. But if I wanted to get a measurement like bytes
>> per second produced, is there a way to do that? It looks
>> like KafkaSystemProducerMetrics and TaskInstanceMetrics only provide
>> number of messages sent.
>>
>> If any of you have any experience in measuring Samza job throughput, can
>> you please share? Really appreciate any ideas on measuring job throughput.
>>
>> Thanks
>> Milinda
>> --
>> Milinda Pathirage
>>
>> PhD Student | Research Assistant
>> School of Informatics and Computing | Data to Insight Center
>> Indiana University
>>
>> twitter: milindalakmal
>> skype: milinda.pathirage
>> blog: http://milinda.pathirage.org
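
The point made in the reply, that a byte count is only well defined where a message is known to be a byte array, can be illustrated with a small language-neutral sketch. This is not Samza code: `ByteThroughputMeter`, `record`, and `bytes_per_second` are hypothetical names invented for illustration, and Samza itself runs on the JVM. The sketch only shows the idea of counting bytes at a layer that sees serialized payloads and skipping anything else.

```python
import time


class ByteThroughputMeter:
    """Hypothetical helper (not a Samza API): counts bytes for serialized messages only."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the rate can be computed deterministically in tests.
        self._clock = clock
        self._start = clock()
        self.total_bytes = 0
        self.skipped = 0

    def record(self, message):
        # Only serialized payloads have a well-defined size in bytes.
        if isinstance(message, (bytes, bytearray)):
            self.total_bytes += len(message)
        else:
            # Non-byte messages (already deserialized objects) are not counted.
            self.skipped += 1

    def bytes_per_second(self):
        # Average rate since the meter was created.
        elapsed = self._clock() - self._start
        return self.total_bytes / elapsed if elapsed > 0 else 0.0
```

A meter like this would only give meaningful numbers if it sat at a layer that sees raw byte arrays, which is the reason the reply points at the KafkaSystem level rather than the container.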