Return-Path: X-Original-To: apmail-samza-dev-archive@minotaur.apache.org Delivered-To: apmail-samza-dev-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id DD11A1854F for ; Mon, 2 Nov 2015 18:33:54 +0000 (UTC) Received: (qmail 67101 invoked by uid 500); 2 Nov 2015 18:33:54 -0000 Delivered-To: apmail-samza-dev-archive@samza.apache.org Received: (qmail 66961 invoked by uid 500); 2 Nov 2015 18:33:54 -0000 Mailing-List: contact dev-help@samza.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@samza.apache.org Delivered-To: mailing list dev@samza.apache.org Received: (qmail 66935 invoked by uid 99); 2 Nov 2015 18:33:54 -0000 Received: from reviews-vm.apache.org (HELO reviews.apache.org) (140.211.11.40) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 02 Nov 2015 18:33:54 +0000 Received: from reviews.apache.org (localhost [127.0.0.1]) by reviews.apache.org (Postfix) with ESMTP id B7BD128E8C7; Mon, 2 Nov 2015 18:33:52 +0000 (UTC) Content-Type: multipart/alternative; boundary="===============7875327696436055764==" MIME-Version: 1.0 Subject: Re: Review Request 39806: SAMZA-798 : Performance and stability issue after combining checkpoint and coordinator stream From: "Xinyu Liu" To: "Jake Maes" , "Chris Riccomini" , "Yi Pan (Data Infrastructure)" , "Jagadish Venkatraman" Cc: "Navina Ramesh" , "Xinyu Liu" , "samza" Date: Mon, 02 Nov 2015 18:33:52 -0000 Message-ID: <20151102183352.16291.63088@reviews.apache.org> X-ReviewBoard-URL: https://reviews.apache.org/ Auto-Submitted: auto-generated Sender: "Xinyu Liu" X-ReviewGroup: Samza X-Auto-Response-Suppress: DR, RN, OOF, AutoReply X-ReviewRequest-URL: https://reviews.apache.org/r/39806/ X-Sender: "Xinyu Liu" References: <20151030180623.22461.60061@reviews.apache.org> In-Reply-To: <20151030180623.22461.60061@reviews.apache.org> Reply-To: "Xinyu Liu" X-ReviewRequest-Repository: samza --===============7875327696436055764== MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit > On Oct. 30, 2015, 6:06 p.m., Xinyu Liu wrote: > > samza-core/src/main/scala/org/apache/samza/checkpoint/CheckpointTool.scala, line 132 > > > > > > I think scala prefers to use the companion object as the factory to create new instance (code before change). Is there any reason for this change? > > Navina Ramesh wrote: > I didn't know that about scala. But what is the advantage behind it? It only seems to obscure the functionality of an instance, imo. Let me know why. I don't mind either way. :) My understanding is that using the companion object apply() method is the "functional" way to create new object, instead of new object(see http://stackoverflow.com/questions/9737352/what-is-the-apply-function-in-scala). So it's pretty popular in Scala. It does looks cleaner to me if there are multiple constructors defined or creating subclass objects, plus the scala lib uses this pattern a lot, like Map(), List(), ... - Xinyu ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/39806/#review104583 ----------------------------------------------------------- On Nov. 2, 2015, 5:50 a.m., Navina Ramesh wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/39806/ > ----------------------------------------------------------- > > (Updated Nov. 2, 2015, 5:50 a.m.) > > > Review request for samza, Chris Riccomini, Jake Maes, Jagadish Venkatraman, and Yi Pan (Data Infrastructure). > > > Bugs: SAMZA-798 > https://issues.apache.org/jira/browse/SAMZA-798 > > > Repository: samza > > > Description > ------- > > Adding interfaces for CheckpointManager, CheckpointManagerFactory and moving Checkpoint to api > > > Adding KafkaCheckpointLogKey, KafkaCheckpointManager and KafkaCheckpointManagerFactory back from 0.9.1 > > > Changed SamzaContainer and OffsetManager > > > Removed checkpointmanager in JC and modified TaskModel to remove offsetMapping. Container will continue to use offsetmanager for fetching offsets > > > Fixed OffsetManager bugs > > > Got rid of all compile errors during build with -x test > > > Fixing Jackson object mapper for TaskModel > > > Commented tests in checkpoint manager and fixed other failing tests > > > Refactored KCM and moved generic functions like createTopic & validateTopic to kafkaUtil.scala > > > KCM unit tests work > > > Got rid of old migration code and its test. Got rid of redundant KCM > > > Commented out migration related tests in jobrunner > > > Moved migration code from old.checkpoint package > > > Fixed 1 migration test > > > Fixed checkpoint migration and its unit tests > > > Removed migration related tests from TestKafkaCheckpointManager > > > Removed some commented lines and fixed a test in TestJobCoordinator > > > Deleted CheckpointManager and SetCheckpoint > > > Diffs > ----- > > docs/learn/documentation/versioned/jobs/configuration-table.html 4adac09305cbdb07b0d2cd9f87777b189df1c290 > samza-api/src/main/java/org/apache/samza/checkpoint/CheckpointManager.java PRE-CREATION > samza-api/src/main/java/org/apache/samza/checkpoint/CheckpointManagerFactory.java PRE-CREATION > samza-core/src/main/java/org/apache/samza/checkpoint/Checkpoint.java > samza-core/src/main/java/org/apache/samza/checkpoint/CheckpointManager.java 0185751c28979e50b1bddc28c90339defd94200b > samza-core/src/main/java/org/apache/samza/coordinator/stream/messages/SetCheckpoint.java 21afa8569801150e81b4c14ee21a9077dfa1895f > samza-core/src/main/java/org/apache/samza/job/model/TaskModel.java e00c49d5255c0af6d44e251aed4e8360cd3026c5 > samza-core/src/main/java/org/apache/samza/serializers/model/JsonTaskModelMixIn.java 172358a5428c9789e0883fc0e5ad3e5c3398478a > samza-core/src/main/scala/org/apache/samza/checkpoint/CheckpointTool.scala 2e3aeb8fd5a86aa39464adff9e75aca96622ebad > samza-core/src/main/scala/org/apache/samza/checkpoint/OffsetManager.scala 1464acc7ec6592a21c3cdf96f34847e094e9e5e3 > samza-core/src/main/scala/org/apache/samza/checkpoint/file/FileSystemCheckpointManager.scala PRE-CREATION > samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala 0b73403018b895879ed2c0538a5cd495813d2eae > samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala 03299cb7cb93d43165a74206113497462d8119e9 > samza-core/src/main/scala/org/apache/samza/migration/JobRunnerMigration.scala 374e27e8233a27132019d429f6fa1f131db3fe15 > samza-core/src/test/java/org/apache/samza/coordinator/stream/MockCoordinatorStreamWrappedConsumer.java dd04d28e54e7afe0cc6d6c2aa508911a14e668bf > samza-core/src/test/java/org/apache/samza/serializers/model/TestSamzaObjectMapper.java ad1fbc597802078c1a1b7d8f1dbafbd5adf610ae > samza-core/src/test/scala/org/apache/samza/checkpoint/TestCheckpointTool.scala 00b89773ad00b8f445bb1320121ab8af56870327 > samza-core/src/test/scala/org/apache/samza/checkpoint/TestOffsetManager.scala c00ef91c13b96c8b1845822046343b652a33c1d5 > samza-core/src/test/scala/org/apache/samza/checkpoint/file/TestFileSystemCheckpointManager.scala PRE-CREATION > samza-core/src/test/scala/org/apache/samza/container/TestSamzaContainer.scala a77ddc7640a8dbbdee391e65a5b432c477b0b67b > samza-core/src/test/scala/org/apache/samza/container/grouper/task/TestGroupByContainerCount.scala ddf1fdef9265b4dbd0e24abe2bff63a3e1244733 > samza-core/src/test/scala/org/apache/samza/coordinator/TestJobCoordinator.scala 1393da84f145c81efd59baabc8a7d3d2132aa05f > samza-core/src/test/scala/org/apache/samza/job/local/TestProcessJob.scala a1efe6f2707dc59d2414ebcc0b38f0f95150da64 > samza-kafka/src/main/scala/old/checkpoint/KafkaCheckpointLogKey.scala 958d07ce3e5d69b15ad74ff52f4572822e0bf09f > samza-kafka/src/main/scala/old/checkpoint/KafkaCheckpointManager.scala 627631aa7e3d77349b9e6896fc21737855b0e946 > samza-kafka/src/main/scala/old/checkpoint/KafkaCheckpointManagerFactory.scala 189752a13f2363c632e3781c0e649a4aae65a9b4 > samza-kafka/src/main/scala/old/checkpoint/KafkaCheckpointMigration.scala 32afe4c6832df4de0f54007d3e4ee0ce9be856f7 > samza-kafka/src/main/scala/org/apache/samza/config/KafkaConfig.scala 798033c300a8e816589233a3dc7639ca88841b40 > samza-kafka/src/main/scala/org/apache/samza/migration/KafkaCheckpointMigration.scala PRE-CREATION > samza-kafka/src/main/scala/org/apache/samza/util/KafkaUtil.scala a7a095b4d2f19be5ad6119d5bfc715bffaeb68af > samza-kafka/src/main/scala/org/apache/samza/util/KafkaUtilException.scala PRE-CREATION > samza-kafka/src/test/scala/old/checkpoint/TestKafkaCheckpointManager.scala 2c0304f98eb0de6c644f55d6a758a7a20ec98e0e > samza-kafka/src/test/scala/org/apache/samza/checkpoint/kafka/TeskKafkaCheckpointLogKey.scala PRE-CREATION > samza-kafka/src/test/scala/org/apache/samza/checkpoint/kafka/TestKafkaCheckpointManager.scala PRE-CREATION > samza-kafka/src/test/scala/org/apache/samza/migration/TestKafkaCheckpointMigration.scala PRE-CREATION > samza-yarn/src/test/java/org/apache/samza/job/yarn/TestContainerAllocator.java b20e3516190aa65c4393fe9a50d6c8b7e7eb7f0b > samza-yarn/src/test/java/org/apache/samza/job/yarn/TestHostAwareContainerAllocator.java 08e53aaf3aaebccf80e79313c3f38fec38359e81 > samza-yarn/src/test/java/org/apache/samza/job/yarn/TestSamzaTaskManager.java b12ae5c1eaaee8e94d6e62a925a98d2c952fdb72 > samza-yarn/src/test/scala/org/apache/samza/job/yarn/TestSamzaAppMasterLifecycle.scala ec5a8533c7a31b9790504e18e0528be28c77d496 > > Diff: https://reviews.apache.org/r/39806/diff/ > > > Testing > ------- > > ./gradlew clean build > > Tests: > 1. First time job deployment performs migration - DONE > 2. Second time job deployment only performs migration check and doesn't actually migrate anything - DONE > 3. Checkpoint Tool works as expected - DONE > 4. Broadcast Stream works as expected - DONE > 5. FileSystemCheckpointManager works as expected - DONE > > > Thanks, > > Navina Ramesh > > --===============7875327696436055764==--