From: Aaron Niskodé-Dossett <dossett@gmail.com>
Date: Wed, 07 Sep 2016 16:11:43 +0000
Subject: Re: Increasing worker parallelism decreases throughput and increases tuple timeout
To: user@storm.apache.org
Let us know what you find, especially if the serializer needs to be more defensive to ensure proper caching.

On Tue, Sep 6, 2016 at 8:45 AM Kristopher Kane <kkane.list@gmail.com> wrote:

> Come to think of it, I did see RestUtils rank somewhat higher in the
> visualvm CPU profiler but did not give it the attention it deserved.
>
> On Tue, Sep 6, 2016 at 9:39 AM, Aaron Niskodé-Dossett <dossett@gmail.com> wrote:
>
>> Hi Kris,
>>
>> One possibility is that the serializer isn't actually caching the schema
>> <-> id mappings and is hitting the schema registry every time. The call to
>> register() in getFingerprint() [1] in particular can be finicky, since the
>> cache is ultimately in an identity hash map, not a regular old hash map [2].
>> I'm familiar with the Avro deserializer you're using and thought it
>> accounted for this, but perhaps not.
>>
>> You could add timing information to the getFingerprint() and getSchema()
>> calls in ConfluentAvroSerializer. If the results indicate cache misses,
>> that's probably your culprit.
>>
>> Best, Aaron
>>
>> [1] https://github.com/apache/storm/blob/master/external/storm-hdfs/src/main/java/org/apache/storm/hdfs/avro/ConfluentAvroSerializer.java#L66
>> [2] https://github.com/confluentinc/schema-registry/blob/v1.0/client/src/main/java/io/confluent/kafka/schemaregistry/client/CachedSchemaRegistryClient.java#L79
>>
>> On Tue, Sep 6, 2016 at 7:40 AM Kristopher Kane <kkane.list@gmail.com> wrote:
>>
>>> Hi everyone.
>>>
>>> I have a simple topology that uses the Avro serializer
>>> (https://github.com/apache/storm/blob/master/external/storm-hdfs/src/main/java/org/apache/storm/hdfs/avro/ConfluentAvroSerializer.java)
>>> and writes to Elasticsearch.
>>>
>>> The topology is like this:
>>>
>>> Kafka (raw scheme) -> Avro deserializer -> Elasticsearch
>>>
>>> This topology runs well with one worker; however, once I add one more
>>> worker (a total of two) and change nothing else, the topology throughput
>>> drops and tuples start timing out.
>>>
>>> I've attached visualvm/jstatd to the workers in multi-worker mode -
>>> and added some JMX configs to the worker opts - but I am unable to see
>>> anything glaring.
>>>
>>> I've never seen Storm act this way, but I have also never worked with a
>>> custom serializer, so I assume it is the culprit, even though I cannot
>>> explain why.
>>>
>>> Any pointers would be appreciated.
>>>
>>> Kris
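[Editor's note] The identity-map pitfall described above can be demonstrated in isolation. The sketch below uses String keys as a stand-in for parsed Avro Schema objects; the real cache in CachedSchemaRegistryClient is keyed on Schema instances, so two logically equal schemas parsed separately can still miss the cache and trigger a registry round trip:

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

public class IdentityCacheDemo {
    public static void main(String[] args) {
        // Two logically equal keys that are distinct object instances,
        // standing in for two separately parsed copies of the same schema.
        String a = new String("{\"type\":\"record\",\"name\":\"Foo\"}");
        String b = new String("{\"type\":\"record\",\"name\":\"Foo\"}");

        Map<String, Integer> byEquality = new HashMap<>();
        Map<String, Integer> byIdentity = new IdentityHashMap<>();

        byEquality.put(a, 42);
        byIdentity.put(a, 42);

        // HashMap finds the entry via equals()/hashCode() ...
        System.out.println("HashMap hit:         " + byEquality.containsKey(b)); // true
        // ... but IdentityHashMap compares keys with ==, so a fresh
        // but equal copy misses the cache entirely.
        System.out.println("IdentityHashMap hit: " + byIdentity.containsKey(b)); // false
    }
}
```

The practical consequence: a serializer that wants cache hits must reuse the exact same Schema instance on every call, not re-parse the schema each time.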
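[Editor's note] The timing suggestion in the thread can be sketched generically. TimedLookup and its timed() helper are hypothetical names, not part of ConfluentAvroSerializer; in practice you would wrap the bodies of getFingerprint() and getSchema() the same way. A cache hit should take microseconds, while a miss that goes over HTTP to the schema registry takes milliseconds:

```java
import java.util.function.Supplier;

public class TimedLookup {
    // Wrap any lookup in a wall-clock timer and log how long it took.
    public static <T> T timed(String label, Supplier<T> call) {
        long start = System.nanoTime();
        T result = call.get();
        long micros = (System.nanoTime() - start) / 1_000;
        System.out.println(label + " took " + micros + " us");
        return result;
    }

    public static void main(String[] args) {
        // In the real serializer you would time the registry/cache lookup;
        // here we just time a stand-in computation.
        int fingerprint = timed("getFingerprint", () -> "schema-json".hashCode());
        System.out.println("fingerprint = " + fingerprint);
    }
}
```

If the logged times stay in the millisecond range on every tuple rather than dropping after the first lookup, the cache is missing and each serialization is paying a network round trip.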