From: Amani Alonazi
To: user@giraph.apache.org, dev@giraph.apache.org
Date: Wed, 22 Aug 2012 08:12:25 +0300
Subject: Giraph Job "Task attempt_* failed to report status" Problem

Hi all,

I'm running a minimum spanning tree compute function on a Hadoop cluster (20 machines). After a certain superstep (e.g. superstep 47 for a graph of 4,194,304 vertices and 181,566,970 edges), the execution time increases dramatically. That is not the only problem: the job is then killed with "Task attempt_* failed to report status for 601 seconds. Killing!"
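
For context on the kill message: Hadoop kills a task attempt that has not reported progress within the task timeout (the `mapred.task.timeout` property, 600000 ms by default in Hadoop 1.x), which matches the 601 seconds above. Whether one long-running compute call is the cause here is not confirmed. The sketch below only illustrates the general pattern of reporting liveness from inside a long loop. `ProgressReporter`, `processVertices` and `expensiveStep` are hypothetical names standing in for Hadoop's `Progressable` and for the real per-vertex work; no Hadoop or Giraph classes are used.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for Hadoop's org.apache.hadoop.util.Progressable. */
interface ProgressReporter {
    void progress();
}

class ProgressSketch {
    /**
     * Runs a long per-vertex loop and calls reporter.progress() once every
     * reportEvery vertices, so a watchdog can see the task is still alive.
     * Returns the sum of the placeholder work values.
     */
    static long processVertices(int vertexCount, int reportEvery,
                                ProgressReporter reporter) {
        long checksum = 0;
        for (int v = 0; v < vertexCount; v++) {
            checksum += expensiveStep(v);
            if ((v + 1) % reportEvery == 0) {
                reporter.progress(); // liveness signal, not a result
            }
        }
        return checksum;
    }

    /** Placeholder for the real (slow) per-vertex computation. */
    static long expensiveStep(int v) {
        return v % 7;
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        long checksum = processVertices(1_000_000, 100_000, calls::incrementAndGet);
        System.out.println("progress calls: " + calls.get()
                + ", checksum: " + checksum);
    }
}
```

In a real job the reporter would be the task context supplied by Hadoop rather than a counter. Raising the timeout is the other common workaround, though it hides a slow superstep rather than explaining it.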

I disabled the checkpoint feature by setting "CHECKPOINT_FREQUENCY_DEFAULT = 0" in GiraphJob.java. I don't need to write any data to disk, neither snapshots nor output. I tested the algorithm on a sample graph of 7 vertices and it works well.

Is there any way to profile or debug a Giraph job?
In the Giraph Stats, does the "Aggregate finished vertices" counter count the vertices that have voted to halt? And is the "sent messages" counter per superstep or the total number of messages?
If a vertex votes to halt, will it be activated again upon receiving messages?
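
To make the last question concrete: in the Pregel model that Giraph implements, a vertex that has voted to halt is skipped in later supersteps until a message arrives for it, at which point it becomes active again. The toy simulation below shows that rule in isolation. It uses no Giraph classes; `ToyVertex`, `run` and the message pattern (vertex 0 sends one message to vertex 1 in superstep 0) are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HaltSketch {
    /** Toy vertex; hypothetical, not Giraph's Vertex class. */
    static class ToyVertex {
        final int id;
        boolean halted = false;
        int computeCalls = 0;

        ToyVertex(int id) {
            this.id = id;
        }
    }

    /**
     * Runs supersteps until every vertex is halted and no messages are
     * pending. Every vertex votes to halt after each compute call.
     * Returns the number of supersteps in which some vertex computed.
     */
    static int run(List<ToyVertex> vertices) {
        // vertex id -> number of messages delivered for the current superstep
        Map<Integer, Integer> inbox = new HashMap<>();
        int superstep = 0;
        while (true) {
            Map<Integer, Integer> outbox = new HashMap<>();
            boolean anyComputed = false;
            for (ToyVertex v : vertices) {
                if (inbox.getOrDefault(v.id, 0) > 0) {
                    v.halted = false; // an incoming message reactivates it
                }
                if (v.halted) {
                    continue; // halted and no messages: skipped
                }
                anyComputed = true;
                v.computeCalls++;
                if (superstep == 0 && v.id == 0) {
                    outbox.merge(1, 1, Integer::sum); // one message to vertex 1
                }
                v.halted = true; // vote to halt
            }
            if (!anyComputed) {
                break; // all halted, nothing in flight
            }
            inbox = outbox;
            superstep++;
        }
        return superstep;
    }

    public static void main(String[] args) {
        List<ToyVertex> vertices = new ArrayList<>();
        vertices.add(new ToyVertex(0));
        vertices.add(new ToyVertex(1));
        int supersteps = run(vertices);
        System.out.println("supersteps: " + supersteps);
        for (ToyVertex v : vertices) {
            System.out.println("vertex " + v.id + " computed "
                    + v.computeCalls + " time(s)");
        }
    }
}
```

In this run vertex 1 is computed twice (once initially and once after the message wakes it), while vertex 0 is computed only once.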

Thanks a lot!

Best,
Amani AlOnazi
MSc Computer Science
King Abdullah University of Science and Technology
Kingdom of Saudi Arabia