From: Aaron Niskodé-Dossett
Date: Tue, 06 Sep 2016 13:39:56 +0000
Subject: Re: Increasing worker parallelism decreases throughput and increases tuple timeout
To: user@storm.apache.org
Hi Kris,

One possibility is that the serializer isn't actually caching the schema <-> id mappings and is hitting the schema registry every time. The call to register() in getFingerprint() [1] in particular can be finicky, since the cache is ultimately an identity hash map, not a regular old HashMap [2]. I'm familiar with the Avro deserializer you're using and thought it accounted for this, but perhaps not.

You could add timing information to the getFingerprint() and getSchema() calls in ConfluentAvroSerializer. If the results indicate cache misses, that's probably your culprit.

Best, Aaron

[1] https://github.com/apache/storm/blob/master/external/storm-hdfs/src/main/java/org/apache/storm/hdfs/avro/ConfluentAvroSerializer.java#L66
[2] https://github.com/confluentinc/schema-registry/blob/v1.0/client/src/main/java/io/confluent/kafka/schemaregistry/client/CachedSchemaRegistryClient.java#L79

On Tue, Sep 6, 2016 at 7:40 AM Kristopher Kane wrote:

> Hi everyone.
>
> I have a simple topology that uses the Avro serializer
> (https://github.com/apache/storm/blob/master/external/storm-hdfs/src/main/java/org/apache/storm/hdfs/avro/ConfluentAvroSerializer.java)
> and writes to Elasticsearch.
>
> The topology is like this:
>
> Kafka (raw scheme) -> Avro deserializer -> Elasticsearch
>
> This topology runs well with one worker; however, once I add one more
> worker (two total) and change nothing else, the topology throughput
> drops and tuples start timing out.
>
> I've attached visualvm/jstatd to the workers in multi-worker mode, and
> added some JMX configs to the worker opts, but I am unable to see
> anything glaring.
>
> I've never seen Storm act this way, but I have also never worked with a
> custom serializer, so I assume that it is the culprit, though I cannot
> explain why.
>
> Any pointers would be appreciated.
> Kris
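The identity-cache pitfall described above can be sketched in isolation. `IdentityHashMap` compares keys with `==` rather than `equals()`, so two structurally identical keys that are distinct objects (standing in here for two equal Avro `Schema` instances parsed at different times) miss the cache. This is a minimal, self-contained illustration using `String` keys, not the actual registry client code:

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

public class IdentityCacheDemo {
    public static void main(String[] args) {
        // Two keys equal by content but distinct objects, standing in for
        // two structurally identical Schema instances.
        String first = new String("record-schema");
        String second = new String("record-schema");

        Map<String, Integer> identityCache = new IdentityHashMap<>();
        Map<String, Integer> regularCache = new HashMap<>();

        identityCache.put(first, 7);
        regularCache.put(first, 7);

        // The identity-keyed cache misses: 'second' is equal but not the
        // same object, so a lookup would fall through to the registry.
        System.out.println(identityCache.containsKey(second));
        System.out.println(regularCache.containsKey(second));
    }
}
```

If the serializer re-parses or re-builds the schema per tuple instead of reusing one `Schema` instance, every lookup against an identity-keyed cache becomes a miss and a registry round trip.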
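The timing suggestion above can be sketched as follows. This is a hypothetical stand-in, not the real ConfluentAvroSerializer: the method name `getFingerprint` matches the class under discussion, but the body here just counts simulated registry round trips, where the real code would call the registry client. With a working cache the counter would stay at 1; a count that grows with every call is the cache-miss symptom:

```java
public class TimedLookupDemo {
    static long registryLookups = 0;

    // Wrap the registry-facing call with nanoTime measurements so cache
    // misses show up as slow, repeated calls in the worker logs.
    static long getFingerprint(String schemaJson) {
        long start = System.nanoTime();
        try {
            registryLookups++;               // stand-in for a registry round trip
            return schemaJson.hashCode();    // stand-in for the registered id
        } finally {
            long elapsedMicros = (System.nanoTime() - start) / 1_000;
            System.out.println("getFingerprint took " + elapsedMicros + " us");
        }
    }

    public static void main(String[] args) {
        // Same schema twice: a cached path would hit the registry only once.
        getFingerprint("{\"type\":\"string\"}");
        getFingerprint("{\"type\":\"string\"}");
        System.out.println(registryLookups);
    }
}
```

The same wrapper shape applies to getSchema(); logging the elapsed time per call is usually enough to tell a local cache hit (microseconds) from a registry round trip (milliseconds).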