Return-Path:
X-Original-To: apmail-giraph-dev-archive@www.apache.org
Delivered-To: apmail-giraph-dev-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 6F2A1D0DE for ; Tue, 21 Aug 2012 19:56:39 +0000 (UTC)
Received: (qmail 45813 invoked by uid 500); 21 Aug 2012 19:56:39 -0000
Delivered-To: apmail-giraph-dev-archive@giraph.apache.org
Received: (qmail 45764 invoked by uid 500); 21 Aug 2012 19:56:39 -0000
Mailing-List: contact dev-help@giraph.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@giraph.apache.org
Delivered-To: mailing list dev@giraph.apache.org
Received: (qmail 45591 invoked by uid 500); 21 Aug 2012 19:56:38 -0000
Delivered-To: apmail-incubator-giraph-dev@incubator.apache.org
Received: (qmail 45537 invoked by uid 99); 21 Aug 2012 19:56:38 -0000
Received: from arcas.apache.org (HELO arcas) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 21 Aug 2012 19:56:38 +0000
Received: from arcas.apache.org (localhost [127.0.0.1]) by arcas (Postfix) with ESMTP id 4C4B92C5C06 for ; Tue, 21 Aug 2012 19:56:38 +0000 (UTC)
Date: Wed, 22 Aug 2012 06:56:38 +1100 (NCT)
From: "Avery Ching (JIRA)"
To: giraph-dev@incubator.apache.org
Message-ID: <966169314.36807.1345578998313.JavaMail.jiratomcat@arcas>
In-Reply-To: <2018772605.26354.1345276357966.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (GIRAPH-306) Netty requests should be reliable and implement exactly once semantics
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/GIRAPH-306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438989#comment-13438989 ]

Avery Ching commented on GIRAPH-306:
------------------------------------

Here are some tests with the RandomMessageBenchmark.
Note that variance is high. Let's ignore the setup and input superstep times, as they vary based on the connection attempts. The superstep times before and after the patch are pretty close. I'm running this in a shared cluster. I don't think the overhead is significant, and reliability is important.

My arguments:

hadoop jar giraph-0.2-SNAPSHOT-for-hadoop-0.20.1-jar-with-dependencies.jar org.apache.giraph.benchmark.RandomMessageBenchmark -Dmapred.child.java.opts="-Xms1500m -Xmx1500m -Xss160k" -Dgiraph.useNetty=true -Dmapred.map.max.attempts=1 -Dmapred.fairscheduler.pool=di.graphonly -Dmapreduce.job.user.classpath.first=true -s 5 -e 1 -v -w 100 -n 40 -b 1024 -V 1000000

Before GIRAPH-306:

12/08/21 12:37:02 INFO mapred.JobClient: Giraph Timers
12/08/21 12:37:02 INFO mapred.JobClient: Total (milliseconds)=209108
12/08/21 12:37:02 INFO mapred.JobClient: Superstep 3 (milliseconds)=16142
12/08/21 12:37:02 INFO mapred.JobClient: Setup (milliseconds)=88873
12/08/21 12:37:02 INFO mapred.JobClient: Vertex input superstep (milliseconds)=37255
12/08/21 12:37:02 INFO mapred.JobClient: Shutdown (milliseconds)=1251
12/08/21 12:37:02 INFO mapred.JobClient: Superstep 0 (milliseconds)=15088
12/08/21 12:37:02 INFO mapred.JobClient: Superstep 4 (milliseconds)=16251
12/08/21 12:37:02 INFO mapred.JobClient: Superstep 5 (milliseconds)=1529
12/08/21 12:37:02 INFO mapred.JobClient: Superstep 2 (milliseconds)=16043
12/08/21 12:37:02 INFO mapred.JobClient: Superstep 1 (milliseconds)=16671

12/08/21 12:27:05 INFO mapred.JobClient: Giraph Timers
12/08/21 12:27:05 INFO mapred.JobClient: Total (milliseconds)=269081
12/08/21 12:27:05 INFO mapred.JobClient: Superstep 3 (milliseconds)=14929
12/08/21 12:27:05 INFO mapred.JobClient: Setup (milliseconds)=46770
12/08/21 12:27:05 INFO mapred.JobClient: Vertex input superstep (milliseconds)=123033
12/08/21 12:27:05 INFO mapred.JobClient: Shutdown (milliseconds)=1359
12/08/21 12:27:05 INFO mapred.JobClient: Superstep 0 (milliseconds)=17098
12/08/21 12:27:05 INFO mapred.JobClient: Superstep 4 (milliseconds)=16759
12/08/21 12:27:05 INFO mapred.JobClient: Superstep 5 (milliseconds)=11882
12/08/21 12:27:05 INFO mapred.JobClient: Superstep 2 (milliseconds)=18835
12/08/21 12:27:05 INFO mapred.JobClient: Superstep 1 (milliseconds)=18409

12/08/21 12:41:31 INFO mapred.JobClient: Giraph Timers
12/08/21 12:41:31 INFO mapred.JobClient: Total (milliseconds)=191158
12/08/21 12:41:31 INFO mapred.JobClient: Superstep 3 (milliseconds)=19005
12/08/21 12:41:31 INFO mapred.JobClient: Setup (milliseconds)=49267
12/08/21 12:41:31 INFO mapred.JobClient: Vertex input superstep (milliseconds)=39635
12/08/21 12:41:31 INFO mapred.JobClient: Shutdown (milliseconds)=2483
12/08/21 12:41:31 INFO mapred.JobClient: Superstep 0 (milliseconds)=20668
12/08/21 12:41:31 INFO mapred.JobClient: Superstep 4 (milliseconds)=17100
12/08/21 12:41:31 INFO mapred.JobClient: Superstep 5 (milliseconds)=7636
12/08/21 12:41:31 INFO mapred.JobClient: Superstep 2 (milliseconds)=17253
12/08/21 12:41:31 INFO mapred.JobClient: Superstep 1 (milliseconds)=18106

After GIRAPH-306:

12/08/21 12:46:35 INFO mapred.JobClient: Giraph Timers
12/08/21 12:46:35 INFO mapred.JobClient: Total (milliseconds)=233213
12/08/21 12:46:35 INFO mapred.JobClient: Superstep 3 (milliseconds)=13991
12/08/21 12:46:35 INFO mapred.JobClient: Setup (milliseconds)=81516
12/08/21 12:46:35 INFO mapred.JobClient: Vertex input superstep (milliseconds)=68620
12/08/21 12:46:35 INFO mapred.JobClient: Shutdown (milliseconds)=794
12/08/21 12:46:35 INFO mapred.JobClient: Superstep 0 (milliseconds)=16599
12/08/21 12:46:35 INFO mapred.JobClient: Superstep 4 (milliseconds)=15746
12/08/21 12:46:35 INFO mapred.JobClient: Superstep 5 (milliseconds)=1543
12/08/21 12:46:35 INFO mapred.JobClient: Superstep 2 (milliseconds)=16284
12/08/21 12:46:35 INFO mapred.JobClient: Superstep 1 (milliseconds)=18110

12/08/21 12:53:02 INFO mapred.JobClient: Giraph Timers
12/08/21 12:53:02 INFO mapred.JobClient: Total (milliseconds)=285832
12/08/21 12:53:02 INFO mapred.JobClient: Superstep 3 (milliseconds)=15915
12/08/21 12:53:02 INFO mapred.JobClient: Setup (milliseconds)=48762
12/08/21 12:53:02 INFO mapred.JobClient: Vertex input superstep (milliseconds)=152074
12/08/21 12:53:02 INFO mapred.JobClient: Shutdown (milliseconds)=2438
12/08/21 12:53:02 INFO mapred.JobClient: Superstep 0 (milliseconds)=18609
12/08/21 12:53:02 INFO mapred.JobClient: Superstep 4 (milliseconds)=14075
12/08/21 12:53:02 INFO mapred.JobClient: Superstep 5 (milliseconds)=2248
12/08/21 12:53:02 INFO mapred.JobClient: Superstep 2 (milliseconds)=15277
12/08/21 12:53:02 INFO mapred.JobClient: Superstep 1 (milliseconds)=16422

> Netty requests should be reliable and implement exactly once semantics
> ----------------------------------------------------------------------
>
> Key: GIRAPH-306
> URL: https://issues.apache.org/jira/browse/GIRAPH-306
> Project: Giraph
> Issue Type: Improvement
> Reporter: Avery Ching
> Assignee: Avery Ching
> Priority: Critical
> Attachments: GIRAPH-306.2.patch, GIRAPH-306.patch
>
>
> One of the biggest scalability challenges is getting Giraph to run reliably on a large number of tasks (i.e. > 200). Several problems exist:
> 1) If the connection fails after the initial connection was made, the job will die.
> 2) Requests must be completed exactly once. This is difficult to implement, but required since we cannot have multiple retried requests succeed (i.e. a vertex gets more messages than expected).
> 3) Sometimes there are unresolved addresses, causing failure.
> This patch addresses these issues by re-establishing failed connections and keeping track of every request sent to every worker. If a request fails or passes a timeout, it will be resent. The server will keep track of requests that succeeded to ensure that the same request won't be processed more than once. The structure for keeping track of the succeeded requests on the server is efficient for handling increasing request ids (IncreasingBitSet).
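
As a rough illustration of the request tracking described just above (resend on timeout, plus a server-side record of completed request ids), here is a minimal Python sketch. It is not the Giraph code. `SeenRequests`, `Server`, `Client`, `drop_acks`, and the explicit millisecond clock arguments are all names made up for this example, and the watermark-plus-set tracker is a simplified stand-in for IncreasingBitSet, not its actual design. The sketch assumes each client hands out request ids in increasing order starting at 0.

```python
class SeenRequests:
    """Hypothetical per-client record of request ids already processed."""

    def __init__(self):
        self.low = 0        # every id below this has been processed
        self.above = set()  # processed ids >= low (out-of-order arrivals)

    def mark(self, request_id):
        """Return True the first time an id is seen, False for a duplicate."""
        if request_id < self.low or request_id in self.above:
            return False
        self.above.add(request_id)
        # Slide the watermark over any contiguous run so the set stays small.
        while self.low in self.above:
            self.above.remove(self.low)
            self.low += 1
        return True


class Server:
    """Applies each (client, request id) at most once, even when retried."""

    def __init__(self, drop_acks=0):
        self.seen = {}              # client id -> SeenRequests
        self.delivered = []         # payloads that were actually applied
        self.drop_acks = drop_acks  # fault injection: lose this many responses

    def handle(self, client_id, request_id, payload):
        tracker = self.seen.setdefault(client_id, SeenRequests())
        if tracker.mark(request_id):
            self.delivered.append(payload)
        if self.drop_acks > 0:
            self.drop_acks -= 1
            return False  # the response is lost on the way back
        return True       # acknowledged (also for duplicates)


class Client:
    """Tracks every outstanding request and resends it after a timeout."""

    def __init__(self, client_id, server, timeout_ms):
        self.client_id = client_id
        self.server = server
        self.timeout_ms = timeout_ms
        self.next_id = 0
        self.pending = {}  # request id -> (payload, time last sent in ms)

    def send(self, payload, now_ms):
        request_id = self.next_id
        self.next_id += 1
        self.pending[request_id] = (payload, now_ms)
        self._transmit(request_id)
        return request_id

    def _transmit(self, request_id):
        payload, _ = self.pending[request_id]
        if self.server.handle(self.client_id, request_id, payload):
            del self.pending[request_id]  # acknowledged, stop tracking

    def resend_timed_out(self, now_ms):
        """Resend each pending request whose timeout has elapsed."""
        resent = []
        for request_id, (payload, sent_ms) in list(self.pending.items()):
            if now_ms - sent_ms >= self.timeout_ms:
                self.pending[request_id] = (payload, now_ms)
                self._transmit(request_id)
                resent.append(request_id)
        return resent


# Usage: the first response is lost, so the client retries after the timeout.
server = Server(drop_acks=1)
client = Client(client_id=7, server=server, timeout_ms=1000)
client.send("m0", now_ms=0)            # applied, but the response is lost
client.resend_timed_out(now_ms=1000)   # retried; the server acks, does not re-apply
# server.delivered is now ["m0"]: two transmissions, one application
```

The point of the watermark is that when ids mostly arrive in order, the record of completed requests stays small however many requests have been handled, which is presumably the property the description means by "efficient for handling increasing request ids".
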
> For handling unresolved addresses, I added retry logic to keep trying to resolve the problem.
> This patch also adds several unit tests that use fault injection to simulate a lost response or a closed channel exception on the server. It also has unit tests for IncreasingBitSet to ensure it is working correctly and efficiently.
> This passes all unit tests (including the new ones). Additionally, I have some experimental results as well.
> Previously, I was unable to run reliably with more than 200 workers. With this change I can reliably run 500+ workers. I also ran with 600 workers successfully. This is a really big reliability win for us.
> I can see the code working to do reconnections and re-issue requests when necessary. It's very cool.
> I.e.
> 2012-08-18 00:16:52,109 INFO org.apache.giraph.comm.NettyClient: checkAndFixChannel: Fixing disconnected channel to xxx.xxx.xxx.xxx/xx.xx.xx.xx:30455, open = false, bound = false
> 2012-08-18 00:16:52,111 INFO org.apache.giraph.comm.NettyClient: checkAndFixChannel: Connected to xxx.xxx.xxx.xxx/xxx.xxx.xxx.xxx:30455!
> 2012-08-18 00:16:52,123 INFO org.apache.giraph.comm.NettyClient: checkAndFixChannel: Fixing disconnected channel to xxx.xxx.xxx.xxx/xxx.xxx.xxx.xxx, open = false, bound = false
> 2012-08-18 00:16:52,124 INFO org.apache.giraph.comm.NettyClient: checkAndFixChannel: Connected to xxx.xxx.xxx.xxx/xxx.xxx.xxx.xxx:30117!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira