From: Chris Riccomini
To: dev@samza.incubator.apache.org
Subject: Re: Questions on topic creation
Date: Tue, 14 Oct 2014 22:23:26 +0000

Hey Roger,

> Do I need to manually create the KV store changelog topic?

Yes, unfortunately you do need to create it manually at the moment.

> I saw this ticket (https://issues.apache.org/jira/browse/SAMZA-226) but
> it looks like it's still open.

Yep, that's the ticket to fix the above issue. :) It is indeed still open.

> Do checkpoint topics get created?

Yes.

> Are jobs tasks assigned to partitions of a shared checkpoint topic or do
> they each get their own checkpoint topic?

In 0.7.0, each task got its own partition. In 0.8.0 (post-SAMZA-123), the
checkpoint topic has a single partition, and all tasks in one job share
that partition. Note that each job still has its own checkpoint topic. The
SAMZA-123 JIRA has a design doc that Jakob wrote, which describes how the
checkpoint topic works. For the legacy 0.7.0 checkpoint topics, you can
find docs here:

http://samza.incubator.apache.org/learn/documentation/0.7.0/container/checkpointing.html

> Should I proceed with this version or would it make life easier to use
> trunk or something closer to 0.8.0?

I would recommend using 0.8.0 (master). We've not yet released it, but
that's mostly because we're waiting on SAMZA-236. We've been running 0.8.0
at LinkedIn for several large jobs (600k-800k msgs/sec), and it's been
pretty solid. It also has a ton of performance improvements, a new UI,
etc.

> Anything else I need to watch out for?

If you're already running with 0.7.0, you'll either need to abandon your
checkpoints or wait for SAMZA-354. The 0.8.0 checkpoint topic changes were
backwards incompatible, so we are adding an auto-migration feature, which
hasn't been written yet (though it's being worked on right now).

Cheers,
Chris

On 10/14/14 1:04 PM, "Roger Hoover" wrote:

>Hi all,
>
>I want to deploy a Samza job in a pre-production environment and need to
>figure out how to handle configuration of the various topics. In
>particular, I want to make sure topics like the KV store changelog are
>configured to be compacted so that data isn't lost over time.
>
>Do I need to manually create the KV store changelog topic? I saw this
>ticket (https://issues.apache.org/jira/browse/SAMZA-226) but it looks like
>it's still open.
>
>Do checkpoint topics get created? If not, what does the
>"task.checkpoint.replication.factor" configuration do? Are jobs tasks
>assigned to partitions of a shared checkpoint topic or do they each get
>their own checkpoint topic?
>
>So far I've developed my proof of concept job with 0.7.0. Should I proceed
>with this version or would it make life easier to use trunk or something
>closer to 0.8.0?
>
>Anything else I need to watch out for?
>
>Thanks,
>
>Roger
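
For the manual changelog topic creation discussed in this thread, here is a
minimal sketch of how the Kafka command for a compacted topic might be
assembled. It is not taken from the thread. The topic name, partition count,
replication factor, ZooKeeper address, and script path are all hypothetical
placeholders, and the helper function itself is illustrative only. It assumes
a Kafka release whose kafka-topics.sh accepts --zookeeper and the topic-level
setting cleanup.policy=compact; depending on the Kafka version, the broker's
log cleaner may also need to be enabled for compaction to take effect.

```python
import shlex


def compacted_topic_command(topic, partitions, replication_factor,
                            zookeeper="localhost:2181",
                            script="bin/kafka-topics.sh"):
    """Build (but do not run) a kafka-topics.sh call for a compacted topic.

    All argument values are placeholders chosen by the caller; nothing here
    is specific to Samza beyond the intent of creating a changelog topic.
    """
    return [
        script,
        "--create",
        "--zookeeper", zookeeper,
        "--topic", topic,
        "--partitions", str(partitions),
        "--replication-factor", str(replication_factor),
        # Topic-level override asking Kafka to compact the log by key
        # instead of deleting old segments.
        "--config", "cleanup.policy=compact",
    ]


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    cmd = compacted_topic_command("my-job-kv-changelog", 4, 2)
    print(" ".join(shlex.quote(part) for part in cmd))
```

The sketch only prints the command so it can be reviewed before being run
against a real cluster. The partition count should be chosen to suit the job;
the value above is arbitrary.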