Subject: Re: runtimePartitioning in GraphJobRunner
From: Thomas Jungblut
To: dev@hama.apache.org
Date: Mon, 10 Dec 2012 12:21:55 +0100

Yes, because changing the block size to 32 MB will use only 300 MB of
memory, so you can add more machines to fit the number of resulting tasks.

> If each node has small memory, there's no way to process in memory

Yes, so spilling to disk is the easiest solution to save memory, not
changing the partitioning.
If you want to split again along the block boundaries to distribute the
data across the cluster, then do it, but this is plainly wrong.

2012/12/10 Edward J. Yoon

> > > A Hama cluster is scalable. It means that the computing capacity
> > > should be increased by adding slaves. Right?
> >
> > I'm sorry, but I don't see how this relates to the vertex input reader.
>
> Not related to the input reader. It is related to partitioning and load
> balancing. As I reported to you before, to process vertices within a
> 256MB block, each TaskRunner required 25~30GB memory.
>
> If each node has small memory, there's no way to process in memory
> without changing the block size of HDFS.
>
> Do you think this is scalable?
>
> On Mon, Dec 10, 2012 at 7:59 PM, Thomas Jungblut wrote:
> > Oh okay, so if you want to remove that, have a lot of fun. This reader
> > is needed so people can create vertices from their own file format.
> > Going back to a sequencefile input will not only break backward
> > compatibility but also cause the same issues we had before.
> >
> > > A Hama cluster is scalable. It means that the computing capacity
> > > should be increased by adding slaves. Right?
> >
> > I'm sorry, but I don't see how this relates to the vertex input reader.
> >
> > 2012/12/10 Edward J. Yoon
> >
> > > A Hama cluster is scalable. It means that the computing capacity
> > > should be increased by adding slaves. Right?
> > >
> > > As I mentioned before, disk-queue and storing vertices on local disk
> > > are not urgent.
> > >
> > > In short, yeah, I want to remove VertexInputReader and runtime
> > > partitioning in the Graph package.
> > >
> > > See also,
> > > https://issues.apache.org/jira/browse/HAMA-531?focusedCommentId=13527756&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13527756
> > >
> > > On Mon, Dec 10, 2012 at 7:31 PM, Thomas Jungblut wrote:
> > > > uhm, I have no idea what you want to achieve, do you want to get
> > > > back to client-side partitioning?
> > > >
> > > > 2012/12/10 Edward J. Yoon
> > > >
> > > > > If there's no opinion, I'll remove VertexInputReader in
> > > > > GraphJobRunner, because it makes the code complex. Let's
> > > > > reconsider the VertexInputReader after fixing the HAMA-531 and
> > > > > HAMA-632 issues.
> > > > >
> > > > > On Fri, Dec 7, 2012 at 9:35 AM, Edward J. Yoon
> > > > > <edwardyoon@apache.org> wrote:
> > > > > > Or, I'd like to get rid of VertexInputReader.
> > > > > >
> > > > > > On Fri, Dec 7, 2012 at 9:30 AM, Edward J. Yoon
> > > > > > <edwardyoon@apache.org> wrote:
> > > > > > > In fact, there's no choice but to use runtimePartitioning
> > > > > > > (because of VertexInputReader). Right? If so, I would like
> > > > > > > to delete all "if (runtimePartitioning) {" conditions.
> > > > > > >
> > > > > > > --
> > > > > > > Best Regards, Edward J. Yoon
> > > > > > > @eddieyoon
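
As a rough illustration of the block-size point in the reply at the top of this message (a smaller block size yields more, smaller tasks that can be spread over more machines), the sketch below counts the blocks needed for a given input. It assumes one task per block; the 1 GB input size and the names BlockCount and numBlocks are hypothetical and do not come from the thread or from Hama.

```java
// Minimal sketch, not Hama code: counts fixed-size blocks for an input,
// assuming one task is created per block.
final class BlockCount {

    // Ceiling division: number of blocks of blockBytes needed for inputBytes.
    static long numBlocks(long inputBytes, long blockBytes) {
        if (inputBytes < 0 || blockBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return (inputBytes + blockBytes - 1) / blockBytes;
    }

    public static void main(String[] args) {
        final long mb = 1024L * 1024L;
        final long inputBytes = 1024L * mb; // hypothetical 1 GB input

        // The two block sizes mentioned in the thread: 256 MB and 32 MB.
        System.out.println("256 MB blocks: " + numBlocks(inputBytes, 256 * mb));
        System.out.println("32 MB blocks:  " + numBlocks(inputBytes, 32 * mb));
    }
}
```

For the hypothetical 1 GB input this gives 4 blocks at 256 MB and 32 blocks at 32 MB, so each task handles one eighth of the data while the task count grows eightfold.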