From: "Edward J. Yoon"
To: "dev@hama.apache.org"
Subject: Re: Issues about Partitioning and Record converter
Date: Mon, 6 May 2013 07:52:15 +0900

>> Please let me know before this is changed, I would like to work on a
>> separate branch.

Personally, I think we have to focus on high-priority tasks, and on
getting more feedback and contributions from users. So, as changes are
made, I'll release periodically. If you want to work somewhere else,
please do. I don't want to wait for your patches.

On Mon, May 6, 2013 at 7:49 AM, Edward J. Yoon wrote:
> For preparing integration with NoSQLs, of course, maybe condition
> check (whether converted or not) can be used without removing record
> converter.
>
> We need to discuss everything.
>
> On Mon, May 6, 2013 at 7:11 AM, Suraj Menon wrote:
>> I am still -1 if this means our graph module can work only on sequential
>> file format.
>> Please note that you can set record converter to null and make changes to
>> loadVertices for what you desire here.
>>
>> If we came to this design, because TextInputFormat is inefficient, would
>> this work for Avro or Thrift input format?
>> Please let me know before this is changed, I would like to work on a
>> separate branch.
>> You may proceed as you wish.
>>
>> Regards,
>> Suraj
>>
>> On Sun, May 5, 2013 at 4:09 PM, Edward J. Yoon wrote:
>>
>>> I think 'record converter' should be removed. It's not good idea.
>>> Moreover, it's unnecessarily complex. To keep vertex input reader, we
>>> can move related classes into common module.
>>>
>>> Let's go with my original plan.
>>>
>>> On Sat, May 4, 2013 at 9:32 AM, Edward J. Yoon wrote:
>>> > Hi all,
>>> >
>>> > I'm reading our old discussions about record converter, superstep
>>> > injection, and common module:
>>> >
>>> > - http://markmail.org/message/ol32pp2ixfazcxfc
>>> > - http://markmail.org/message/xwtmfdrag34g5xc4
>>> >
>>> > To clarify goals and objectives:
>>> >
>>> > 1. A parallel input partition is necessary for obtaining scalability
>>> > and elasticity of a Bulk Synchronous Parallel processing (It's not a
>>> > memory issue, or Disk/Spilling Queue, or HAMA-644. Please don't
>>> > shake).
>>> > 2. Input partitioning should be handled at BSP framework level, and it
>>> > is for every Hama jobs, not only for Graph jobs.
>>> > 3. Unnecessary I/O Overhead need to be avoided, and NoSQLs input also
>>> > should be considered.
>>> >
>>> > The current problem is that every input of graph jobs should be
>>> > rewritten on HDFS. If you have a good idea, Please let me know.
>>> >
>>> > --
>>> > Best Regards, Edward J. Yoon
>>> > @eddieyoon
>>>
>>> --
>>> Best Regards, Edward J. Yoon
>>> @eddieyoon
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon

--
Best Regards, Edward J. Yoon
@eddieyoon