From: Chris Riccomini
To: "dev@samza.incubator.apache.org"
Subject: Re: Samza threads issues
Date: Wed, 29 Oct 2014 17:12:55 +0000

Hey Dotan,

> should we increase the kafka topic sizes to accommodate incoming data
> during these time gaps as opposed to the parallel GC?

You'll have to experiment and see. I doubt it, though.

> Or on a broader aspect - What are the best practices to measure and set
> the right size for the kafka topics? Can anyone share his experience on
> that?

There's a lot that goes into this. Some things to consider:

1. Peak bytes/sec throughput.
2. Retention policy for the topic.
3. Parallelism requirements for consumers.

At LinkedIn, we start with a default of 8 partitions, and size up as needed.
The need could be that partitions are running too hot (either on reads or
writes), that the partitions are too large on disk (retention policy), or
that the downstream consumers can't keep up because their processing is
slower than the messages/sec on the partition.

Cheers,
Chris

On 10/28/14 11:59 PM, "Dotan Patrich" wrote:

>Thanks Chris,
>We will test our product using SerialGC to see how it behaves.
>
>One concern that I have is regarding the kafka topic sizes - Assuming
>"stop-the-world" GC pauses will be more noticeable using SerialGC, should
>we increase the kafka topic sizes to accommodate incoming data during
>these time gaps as opposed to the parallel GC?
>Or on a broader aspect - What are the best practices to measure and set
>the right size for the kafka topics? Can anyone share his experience on
>that?
>
>Thanks,
>Dotan
>
>On Tue, Oct 28, 2014 at 5:53 PM, Chris Riccomini <
>criccomini@linkedin.com.invalid> wrote:
>
>> Hey Dotan,
>>
>> We run all of our jobs using SerialGC by default. For a few of our
>> higher-throughput jobs, we've had better luck with parallel GC or G1,
>> but in general, serial works fine.
>>
>> Cheers,
>> Chris
>>
>> On 10/28/14 8:34 AM, "Dotan Patrich" wrote:
>>
>> >Hi All,
>> >
>> >I encountered some issues caused by having too many threads for a user
>> >on linux CentOS. Investigating this deeper, it turned out that the JVM
>> >spawns over 31 threads per process for GC. Having about 18 Samza
>> >processes running on the machine, we soon got close to the 1000-thread
>> >limit per user.
>> >I was thinking of running the Samza JVM with SerialGC instead of
>> >parallel GC to avoid having so many threads in the environment. In
>> >addition, theoretically this seems to be better fitted for situations
>> >where we prefer throughput over latency in single-core environments
>> >(this is roughly what our Samza tasks are assigned).
>> >
>> >Before doing so, I would really appreciate your insights - did anyone
>> >encounter this issue before? Is changing the GC to serial a good
>> >solution?
>> >
>> >Thanks,
>> >Dotan
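
As a rough illustration of the three sizing considerations in Chris's reply (peak throughput, retention, consumer parallelism), here is a minimal Python sketch. The function name, the parameter names, and the per-partition limits are all hypothetical and do not come from the thread; only the starting default of 8 (read here as partitions) is taken from Chris's message. The disk estimate assumes peak throughput for the whole retention window, so it is a conservative upper bound. Treat the result as a starting point to validate by experiment, not as a rule.

```python
import math


def estimate_partitions(
    peak_bytes_per_sec,
    peak_msgs_per_sec,
    retention_secs,
    consumer_msgs_per_sec,
    max_partition_bytes_per_sec,   # hypothetical limit before a partition runs "too hot"
    max_partition_bytes_on_disk,   # hypothetical limit before a partition is "too large"
    default_partitions=8,          # starting default mentioned in the thread
):
    """Return a rough partition count for a topic (illustrative only)."""
    # 1. Peak throughput: keep each partition under the per-partition rate limit.
    by_throughput = math.ceil(peak_bytes_per_sec / max_partition_bytes_per_sec)

    # 2. Retention: worst case, the topic holds peak traffic for the whole window.
    retained_bytes = peak_bytes_per_sec * retention_secs
    by_disk = math.ceil(retained_bytes / max_partition_bytes_on_disk)

    # 3. Consumer parallelism: one consumer per partition must keep up with its share.
    by_consumers = math.ceil(peak_msgs_per_sec / consumer_msgs_per_sec)

    # Never go below the default; otherwise take the most demanding constraint.
    return max(default_partitions, by_throughput, by_disk, by_consumers)


if __name__ == "__main__":
    partitions = estimate_partitions(
        peak_bytes_per_sec=50_000_000,
        peak_msgs_per_sec=50_000,
        retention_secs=86_400,
        consumer_msgs_per_sec=2_000,
        max_partition_bytes_per_sec=10_000_000,
        max_partition_bytes_on_disk=100_000_000_000,
    )
    print(partitions)
```

In this made-up example the retention constraint dominates, which corresponds to the "partitions are too large on disk" case in the reply; with a shorter retention window the slow-consumer constraint would decide instead.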