From: Behroz Sikander
Date: Sat, 27 Jun 2015 01:03:24 +0200
Subject: Re: Groomserver BSPPeerChild limit
To: user@hama.apache.org

Hi,

In the current thread I mentioned 3 issues. Issues 1 and 3 are resolved, but issue 2 is still giving me headaches.

My problem: my cluster now consists of 3 machines, each of them (apparently) properly configured. From my master machine, when I start Hadoop and Hama, I can see the processes started on the other 2 machines.

If I check the maximum number of tasks that my cluster can support, I get 9 (3 tasks on each machine). When I run the Pi example, it uses 9 tasks and runs fine. When I run my program with 3 tasks, everything also runs fine. But when I increase the number of tasks to 4 by using "setNumBspTask", Hama freezes. I do not understand what can go wrong. I checked the log files and things look fine; I just sometimes get an exception saying that Hama was not able to delete the system directory (bsp.system.dir) defined in hama-site.xml.

Any help or clue would be great.

Regards,
Behroz Sikander
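For reference, the check Edward suggests further down in the thread can be packaged as a small helper that compares the requested task count with what the cluster reports before submitting. A minimal sketch, assuming the BSPJob/BSPJobClient/ClusterStatus API quoted below; the class and method names (TaskCap, setCappedNumBspTask) are illustrative only:

    import org.apache.hama.HamaConfiguration;
    import org.apache.hama.bsp.BSPJob;
    import org.apache.hama.bsp.BSPJobClient;
    import org.apache.hama.bsp.ClusterStatus;

    public class TaskCap {
      /** Caps the requested task count at what the cluster can actually schedule. */
      public static void setCappedNumBspTask(BSPJob job, HamaConfiguration conf, int requested)
          throws java.io.IOException {
        BSPJobClient jobClient = new BSPJobClient(conf);
        ClusterStatus cluster = jobClient.getClusterStatus(true);  // detailed cluster status
        int max = cluster.getMaxTasks();                           // e.g. 9 for 3 grooms x 3 slots
        job.setNumBspTask(Math.min(requested, max));               // never request more slots than exist
      }
    }

On the 3 x 3 cluster described above, getMaxTasks() should report 9, so a request for 4 tasks is within capacity; if capping changes nothing, the freeze is more likely a groom configuration or scheduling issue than over-subscription.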
On Thu, Jun 25, 2015 at 1:13 PM, Behroz Sikander wrote:

> Thank you :)
>
> On Thu, Jun 25, 2015 at 12:14 AM, Edward J. Yoon wrote:
>
>> Hi,
>>
>> You can get the maximum number of available tasks with the following code:
>>
>> BSPJobClient jobClient = new BSPJobClient(conf);
>> ClusterStatus cluster = jobClient.getClusterStatus(true);
>>
>> // Set to maximum
>> bsp.setNumBspTask(cluster.getMaxTasks());
>>
>> On Wed, Jun 24, 2015 at 11:20 PM, Behroz Sikander wrote:
>> > Hi,
>> > 1) Thank you for this.
>> > 2) Here are the images. I will look into the log files of the Pi example.
>> >
>> > *Result of JPS command on slave*
>> > http://s17.postimg.org/gpwe2bbfj/Screen_Shot_2015_06_22_at_7_23_31_PM.png
>> >
>> > *Result of JPS command on Master*
>> > http://s14.postimg.org/s9922em5p/Screen_Shot_2015_06_22_at_7_23_42_PM.png
>> >
>> > 3) In my current case, I do not have any input submitted to the job. During
>> > run time, I directly fetch data from HDFS. So, I am looking for something
>> > like BSPJob.set*Max*NumBspTask().
>> >
>> > Regards,
>> > Behroz
>> >
>> > On Tue, Jun 23, 2015 at 12:57 AM, Edward J. Yoon wrote:
>> >
>> >> Hello,
>> >>
>> >> 1) You can get the filesystem URI from a configuration using
>> >> "FileSystem fs = FileSystem.get(conf);". Of course, the fs.defaultFS
>> >> property should be in hama-site.xml:
>> >>
>> >> <property>
>> >>   <name>fs.defaultFS</name>
>> >>   <value>hdfs://host1.mydomain.com:9000/</value>
>> >>   <description>The name of the default file system. Either the literal
>> >>   string "local" or a host:port for HDFS.</description>
>> >> </property>
>> >>
>> >> 2) The 'bsp.tasks.maximum' property is the number of tasks per node. It
>> >> looks like a cluster configuration issue. Please run the Pi example and
>> >> look at the logs for more details. NOTE: you cannot attach images to the
>> >> mailing list, so I can't see them.
>> >>
>> >> 3) You can use the BSPJob.setNumBspTask(int) method. If input is
>> >> provided, the number of BSP tasks is basically driven by the number of
>> >> DFS blocks. I'll fix it to be more flexible in HAMA-956.
>> >>
>> >> Thanks!
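To illustrate point 1 above: once fs.defaultFS is set in hama-site.xml, the NameNode address no longer needs to be hard-coded in the job. A minimal sketch, assuming hama-site.xml is on the job's classpath; the class name DefaultFsCheck is illustrative, and host1.mydomain.com:9000 is just the example host from the property above:

    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hama.HamaConfiguration;

    public class DefaultFsCheck {
      public static void main(String[] args) throws Exception {
        // HamaConfiguration loads hama-default.xml and hama-site.xml from the classpath,
        // so the default file system comes from configuration instead of a hard-coded URI.
        HamaConfiguration conf = new HamaConfiguration();
        FileSystem fs = FileSystem.get(conf);   // e.g. hdfs://host1.mydomain.com:9000/
        System.out.println("Default file system: " + fs.getUri());
      }
    }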
>> >> On Tue, Jun 23, 2015 at 2:33 AM, Behroz Sikander wrote:
>> >>
>> >> > Hi,
>> >> > Recently, I moved from a single machine setup to a 2 machine setup. I was
>> >> > successfully able to run my job that uses HDFS to get data. I have 3
>> >> > trivial questions.
>> >> >
>> >> > 1- To access HDFS, I have to manually give the IP address of the server
>> >> > running HDFS. I thought that Hama would automatically pick it up from the
>> >> > configuration, but it does not. I am probably doing something wrong. Right
>> >> > now my code works by using the following:
>> >> >
>> >> > FileSystem fs = FileSystem.get(new URI("hdfs://server_ip:port/"), conf);
>> >> >
>> >> > 2- On my master server, when I start Hama it automatically starts Hama on
>> >> > the slave machine (all good). Both master and slave are set as groomservers.
>> >> > This means that I have 2 servers to run my job, which means that I can open
>> >> > more BSPPeerChild processes. If I submit my jar with 3 bsp tasks, then
>> >> > everything works fine. But when I move to 4 tasks, Hama freezes. Here is the
>> >> > result of the JPS command on the slave.
>> >> >
>> >> > Result of JPS command on Master
>> >> >
>> >> > You can see that it is only opening tasks on the slaves but not on the
>> >> > master.
>> >> >
>> >> > Note: I tried to change the bsp.tasks.maximum property in hama-default.xml
>> >> > to 4, but still the same result.
>> >> >
>> >> > 3- I want my cluster to open as many BSPPeerChild processes as possible. Is
>> >> > there any setting I can use to achieve that? Or does Hama pick up the values
>> >> > from hama-default.xml to open tasks?
>> >> >
>> >> > Regards,
>> >> > Behroz Sikander
>> >>
>> >> --
>> >> Best Regards, Edward J. Yoon
>>
>> --
>> Best Regards, Edward J. Yoon
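On the bsp.tasks.maximum note in the quoted thread: that property is read by each groom server from its local configuration, so raising the per-node BSPPeerChild count generally means overriding it in hama-site.xml on every machine rather than editing hama-default.xml. A small report along these lines (SlotReport is an illustrative name, and the fallback value of 3 is an assumption) shows the per-node setting next to the cluster-wide capacity the master reports:

    import org.apache.hama.HamaConfiguration;
    import org.apache.hama.bsp.BSPJobClient;
    import org.apache.hama.bsp.ClusterStatus;

    public class SlotReport {
      public static void main(String[] args) throws Exception {
        HamaConfiguration conf = new HamaConfiguration();
        // Per-groom slot count as seen by this node's configuration.
        int slotsPerGroom = conf.getInt("bsp.tasks.maximum", 3);
        BSPJobClient client = new BSPJobClient(conf);
        ClusterStatus status = client.getClusterStatus(true);
        System.out.println("Slots per groom (local config): " + slotsPerGroom);
        System.out.println("Cluster-wide task capacity:     " + status.getMaxTasks());
      }
    }

With 3 grooms at 3 slots each, the cluster-wide figure should come back as 9, matching the capacity mentioned at the top of the thread.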