Date: Fri, 15 Aug 2014 23:00:53 +0800
Subject: Re: Remove Spilling Queue and rewrite checkpoint/recovery
From: Chia-Hung Lin
To: dev@hama.apache.org

The code is currently at https://github.com/chlin501/hama.git

Maven and a JDK are required to build the project.

Command for a clean build:

mvn clean install -DskipTests=true -Dmaven.javadoc.skip=true

To run a specific test case:

mvn -DskipTests=false -Dtest= test

On 15 August 2014 18:21, Suraj Menon wrote:
> Hi Edward, sorry to enter the discussion so late.
>
> Bundling and unbundling of the message queue is not the Spilling Queue's
> responsibility; it ended up there to be compatible with the existing
> implementation of BSP peer communication. Remember that the Spilling Queue
> implementation was done first to immediately remove some OutOfMemory
> issues on the sender side. The Spilling Queue gives you a byte array
> (ByteBuffer) with a batch of serialized messages. This is effectively
> bundling the messages in a byte array (hence the ByteArrayMessageBundle)
> and sending them for processing. The SpilledDataProcessors are implemented
> as a processing pipeline built using inheritance, something like what we
> might use traits for in Scala. So if we have a SpilledDataProcessor that
> sends this bundled message via RPC to the peer, there is no need to write
> the messages to a file and read them back. As I previously mentioned, this
> was done to be compatible with the existing implementation of peer.send.
>
> Also, the async checkpoint recovery code was written before the Spilling
> Queue. Today we can remove the single-message write and do this in the
> "before peer sync" phase to just write the whole file to HDFS.
>
> I would say performance numbers and maintainability come first, and if you
> think removing the Spilling Queue is a solution, go for it. As far as async
> checkpointing is concerned, that was a first proof of concept we did,
> and it is high time we moved forward from there.
>
> Chiahung, do you have some instructions on where and how I can build the
> Scala version of your code?
>
> I am really finding it hard to dedicate time to Hama these days.
>
> - Suraj
>
>
> On Tue, Aug 12, 2014 at 7:15 AM, Edward J. Yoon wrote:
>
>> ChiaHung,
>>
>> Yes, I'm thinking similar things.
>>
>> On Tue, Aug 12, 2014 at 4:11 PM, Chia-Hung Lin wrote:
>> > I am currently working on this part based on the superstep API,
>> > similar to the Superstep.java in the trunk.
>> >
>> > The checkpointer[1] saves bundled messages instead of single messages.
>> > Not very sure if this is what you are looking for?
>> >
>> > [1]. https://github.com/chlin501/hama/blob/peer-comm-mech-changed/core/src/main/scala/org/apache/hama/monitor/Checkpointer.scala
>> >
>> > On 12 August 2014 15:04, Edward J. Yoon wrote:
>> >> I think that transferring single messages one at a time is not wise.
>> >> Bundles are used to avoid network overhead and contention. So, if we
>> >> use bundles, each processor always sends/receives bundles.
>> >>
>> >> BSPMessageBundle is Writable (and Iterable), and it manages the
>> >> serialized messages as a byte array. If we write bundles when
>> >> checkpointing or using Disk-queue, it'll be simpler and faster.
>> >>
>> >> In the Spilling Queue case, it always requires the process of
>> >> unbundling and putting messages into the queue.
>> >>
>> >> On Tue, Aug 12, 2014 at 2:41 PM, Tommaso Teofili wrote:
>> >>> -1, can't we first discuss? Also it'd be helpful to be more specific
>> >>> on the problems.
>> >>> Tommaso
>> >>>
>> >>> 2014-08-12 4:25 GMT+02:00 Edward J. Yoon:
>> >>>
>> >>>> All,
>> >>>>
>> >>>> I'll delete the Spilling Queue and rewrite the checkpoint/recovery
>> >>>> implementation (checkpointing bundles is better than checkpointing
>> >>>> all messages). The current implementation is quite a mess :/ there
>> >>>> are huge deserialization/serialization overheads.
>> >>>>
>> >>>> --
>> >>>> Best Regards, Edward J. Yoon
>> >>>> CEO at DataSayer Co., Ltd.
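
For readers following the thread, here is a minimal sketch of the idea being argued for: serialize a batch of messages once into a single byte array, then send or checkpoint that array as-is instead of handling messages one by one. Everything in it is an assumption made for illustration. The class BundleSketch, its method names, and its wire layout (a 4-byte count followed by writeUTF-encoded strings) are hypothetical and are not Hama's BSPMessageBundle, ByteArrayMessageBundle, or Checkpointer. A plain java.io.OutputStream stands in for an HDFS output stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Hama.
final class BundleSketch {

    // Serialize a whole batch into one byte array:
    // a 4-byte message count, then each message in writeUTF format.
    static byte[] bundle(List<String> messages) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(messages.size());
            for (String message : messages) {
                out.writeUTF(message);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reverse of bundle(): read the count, then that many messages.
    static List<String> unbundle(byte[] bundle) {
        try {
            DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(bundle));
            int count = in.readInt();
            List<String> messages = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                messages.add(in.readUTF());
            }
            return messages;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Checkpoint an already-serialized bundle with a single bulk write:
    // a 4-byte length prefix followed by the bundle bytes, with no
    // per-message deserialization or re-serialization.
    static void checkpoint(byte[] bundle, OutputStream sink) {
        try {
            DataOutputStream out = new DataOutputStream(sink);
            out.writeInt(bundle.length);
            out.write(bundle);
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> outgoing = Arrays.asList("m1", "m2", "m3");
        byte[] payload = bundle(outgoing);

        // The same payload can be checkpointed and sent without unbundling.
        ByteArrayOutputStream checkpointFile = new ByteArrayOutputStream();
        checkpoint(payload, checkpointFile);

        // Only the receiving side needs to unbundle.
        List<String> received = unbundle(payload);
        System.out.println("round trip ok: " + received.equals(outgoing));
    }
}
```

In this sketch the checkpoint path never touches individual messages, which is the saving in serialization overhead that the thread attributes to checkpointing bundles rather than single messages.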