From: Steven van Beelen
To: user@hama.apache.org
Subject: Re: HAMA jobs failing, with no debug message - 2
Date: Wed, 20 Nov 2013 14:58:37 +0100

Can I combine the Spilling Queue with the Sorted Message Queue? (e.g. conf.set(MessageManager.QUEUE_TYPE_CLASS, "org.apache.hama.bsp.message.queue.SortedMessageQueue");)
My implementation expects the messages to be received sorted, hence the question.

My program has only one superstep. It is an implementation of Inverted Indexing which first reads in a Sequence File consisting of key/value pairs where the key is a Text object and the value an IntWritable.
The program first parses the Text objects and stores each separate word and its frequency. After each document, it sends a message to another peer containing the word, the document id and the frequency.
Once all the documents have been worked through, sync() is called. After that, a list is created for every word, consisting of all the pairs found.

On Wed, Nov 20, 2013 at 2:40 PM, Edward J. Yoon wrote:

> Why don't you use Spilling Queue? Then, it'll work without any problem.
>
> >> > Last note: I'm running an Inverted Indexing algorithm with a data set of
> >> > approximately 17 GB.
>
> How many supersteps are needed? If your job is too
> communication-intensive, maybe you should consider another approach.
>
> On Wed, Nov 20, 2013 at 10:14 PM, Steven van Beelen wrote:
> > Hi Edward,
> >
> > That was the issue I was thinking of first. So, I increased
> > bsp.child.java.opts to 8Gb and that of the Groomservers to 4Gb.
> > After that, the 84-task run worked, but with 60 tasks it fails as said
> > above.
> > Should I give it more memory? I would think that these amounts per
> > task/Groomserver should be enough.
> >
> > Regards, Steven
> >
> > On Wed, Nov 20, 2013 at 12:16 PM, Edward J. Yoon wrote:
> >
> >> > The only case the program does run, is when I use the maximum number of
> >> > machines (i.e. 7 machines, with 12 cores, 128GB ram..). I set the maximum
> >> > number of tasks to 12 per node, thus 84. But when I force the program to run
> >> > with 60 tasks, the "Job Failed" comes up with no additional info.
> >>
> >> Your case looks like a memory problem. Can you check the memory space
> >> during job execution? Or try to increase the max heap of the BSP child
> >> JVM.
> >>
> >> > the "Job Failed" comes up with no additional info.
> >>
> >> Sorry for the inconvenience, I'll check it out and see what's wrong.
> >>
> >> On Wed, Nov 20, 2013 at 6:22 PM, Steven van Beelen <smcvbeelen@gmail.com> wrote:
> >> > I have a very similar problem as Anveshi Charuvaka is mailing about.
> >> >
> >> > What I found additionally when I set task logging to DEBUG mode, is that the
> >> > DEBUG logs get interrupted at the same point and replaced with the "INFO
> >> > bsp.BSPJobClient: Job failed." message.
> >> > My program works in local, distributed and pseudo mode, so that's probably
> >> > not the issue.
> >> >
> >> > The only case the program does run, is when I use the maximum number of
> >> > machines (i.e. 7 machines, with 12 cores, 128GB ram..). I set the maximum
> >> > number of tasks to 12 per node, thus 84. But when I force the program to run
> >> > with 60 tasks, the "Job Failed" comes up with no additional info.
> >> >
> >> > Last note: I'm running an Inverted Indexing algorithm with a data set of
> >> > approximately 17 GB.
> >> > Could someone help me with this?
> >> >
> >> > Regards, Steven
> >>
> >> --
> >> Best Regards, Edward J. Yoon
> >> @eddieyoon
> >
> --
> Best Regards, Edward J. Yoon
> @eddieyoon
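
For readers following the thread, the single-superstep flow described at the top of this message (count words per document, send one message per word and document, sync, then build a list per word) can be sketched in plain Java. This is only an illustration under stated assumptions: it has no Hama dependency, the two phases are ordinary method calls instead of message passing around sync(), and the names InvertedIndexSketch, Posting, emitMessages and buildIndex are hypothetical and not taken from the original program. Documents are assumed to be given as a map from document id to text.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone sketch; not the poster's actual Hama job.
final class InvertedIndexSketch {

    /** One message: a word, the document it occurred in, and its frequency there. */
    static final class Posting {
        final String word;
        final int docId;
        final int frequency;

        Posting(String word, int docId, int frequency) {
            this.word = word;
            this.docId = docId;
            this.frequency = frequency;
        }

        @Override
        public String toString() {
            return docId + ":" + frequency;
        }
    }

    /** Phase before sync(): count words per document, emit one message per (word, document). */
    static List<Posting> emitMessages(Map<Integer, String> documents) {
        List<Posting> outbox = new ArrayList<>();
        for (Map.Entry<Integer, String> doc : documents.entrySet()) {
            Map<String, Integer> counts = new HashMap<>();
            for (String token : doc.getValue().toLowerCase().split("\\W+")) {
                if (!token.isEmpty()) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                outbox.add(new Posting(e.getKey(), doc.getKey(), e.getValue()));
            }
        }
        return outbox;
    }

    /** Phase after sync(): group the received messages by word into one list per word. */
    static Map<String, List<Posting>> buildIndex(List<Posting> inbox) {
        // TreeMap keeps the words in sorted order, standing in for a sorted receive queue.
        Map<String, List<Posting>> index = new TreeMap<>();
        for (Posting p : inbox) {
            index.computeIfAbsent(p.word, w -> new ArrayList<>()).add(p);
        }
        return index;
    }

    public static void main(String[] args) {
        Map<Integer, String> docs = new TreeMap<>();
        docs.put(1, "to be or not to be");
        docs.put(2, "to do");
        System.out.println(buildIndex(emitMessages(docs)));
    }
}
```

Note that the sketch holds every message in memory at once, which is the part that matters for the heap discussion in this thread: with a 17 GB data set, the number of (word, document) messages buffered per task before and after sync() grows as the task count shrinks.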