From: Vishal Patel
Date: Fri, 10 Aug 2012 16:10:33 -0700
Subject: Saving checkpoints?
To: user
Reply-To: user@giraph.apache.org

Hi,

How do I specify the interval for saving checkpoints? When working with Amazon's Elastic MapReduce on a large number of workers (> 80 workers, 40 x m1.xlarge machines), there are sometimes RPC communication errors, and ZooKeeper waits on the affected worker for a while before timing out and killing the job altogether.

As my graph and number of workers are becoming larger, I would like to learn how to save checkpoints, say every 50 supersteps, since the extra cost might be well worth it. Here is the command I currently use; how should I modify it?

hadoop jar giraph-0.2-SNAPSHOT-jar-with-dependencies.jar org.apache.giraph.GiraphRunner org.apache.giraph.examples.ConnectedComponentsVertex \
--inputFormat org.apache.giraph.examples.IntIntNullIntTextInputFormat \
--inputPath giraph_in/adj_list.txt \
--outputFormat org.apache.giraph.examples.VertexWithComponentTextOutputFormat \
--outputPath giraph_out \
--combiner org.apache.giraph.examples.MinimumIntCombiner \
--workers 95

Also, how do I restart from a specific checkpoint? The help for the GiraphRunner class did not have instructions on this.

Thank you!

Vishal