Mailing-List: contact dev-help@samza.incubator.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@samza.incubator.apache.org
Received-SPF: softfail (nike.apache.org: transitioning domain of
 mark.mindenhall@machineshop.io does not designate 157.56.110.141 as permitted
 sender)
From: Mark Mindenhall <mark.mindenhall@machineshop.io>
To: "dev@samza.incubator.apache.org" <dev@samza.incubator.apache.org>
Subject: Re: Problems running new jobs in hello-samza
Thread-Topic: Problems running new jobs in hello-samza
Thread-Index: AQHP4WD4vex58DH5ekS/PV10zql3mZwjNf+A
Date: Mon, 6 Oct 2014 15:44:14 +0000
Message-ID: <1E76062F-83EB-4A8E-A5FF-7CE149B62E7E@machineshop.io>
References: 
 <CANjo42zQit-WhvAOAjUe4AULpKrU9gFJGGw2zp3yQvxSFQ2_dg@mail.gmail.com>
In-Reply-To: 
 <CANjo42zQit-WhvAOAjUe4AULpKrU9gFJGGw2zp3yQvxSFQ2_dg@mail.gmail.com>
Accept-Language: en-US
Content-Language: en-US
Content-Type: multipart/alternative;
	boundary="_000_1E76062F83EB4A8EA5FF7CE149B62E7Emachineshopio_"
MIME-Version: 1.0

--_000_1E76062F83EB4A8EA5FF7CE149B62E7Emachineshopio_
Content-Type: text/plain; charset="Windows-1252"
Content-Transfer-Encoding: quoted-printable

Hi Zach,

I'm also a relative newbie, but I ran into this same issue.  You are
correct: your 5th job isn't starting because the cluster doesn't have
enough resources available, so you need to reduce the resources each job
requests.

First, in yarn-site.xml I switched over to the FairScheduler
<http://hadoop.apache.org/docs/r2.2.0/hadoop-yarn/hadoop-yarn-site/FairScheduler.html>:

  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>

I also added these two properties (also in yarn-site.xml) to bound the
amount of memory YARN allocates to each container request:

  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>256</value>
    <description>Minimum limit of memory to allocate to each container request at the Resource Manager.</description>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>512</value>
    <description>Maximum limit of memory to allocate to each container request at the Resource Manager.</description>
  </property>

Then, in each of the Samza properties files describing my jobs, I added the
following two settings:

    yarn.container.memory.mb=512
    yarn.am.container.memory.mb=256
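
As a rough sanity check (a sketch, not how YARN actually accounts for
memory), the arithmetic behind these numbers can be written out in Python.
It assumes each job runs one ApplicationMaster plus one SamzaContainer, that
every container is charged exactly what it requests (ignoring any rounding
the scheduler may apply), and that your containers are currently getting
1 GB each, as your cluster metrics suggest. The function name max_jobs is
just something made up for illustration.

```python
# Back-of-the-envelope estimate of how many Samza jobs fit on one YARN node.
# Assumptions (not taken from YARN itself): one AM container plus one
# SamzaContainer per job, and each container is charged exactly the memory
# it requests, with no scheduler rounding.

def max_jobs(node_memory_mb, am_memory_mb, container_memory_mb,
             containers_per_job=1):
    """Return how many jobs fit in the node's memory under the assumptions above."""
    per_job_mb = am_memory_mb + containers_per_job * container_memory_mb
    return node_memory_mb // per_job_mb

NODE_MEMORY_MB = 8 * 1024  # "Memory Total = 8 GB" from the cluster metrics

# 1 GB per container, inferred from 8 containers using 8 GB in total.
before = max_jobs(NODE_MEMORY_MB, am_memory_mb=1024, container_memory_mb=1024)

# The reduced settings: yarn.am.container.memory.mb=256 and
# yarn.container.memory.mb=512.
after = max_jobs(NODE_MEMORY_MB, am_memory_mb=256, container_memory_mb=512)

print(before, after)
```

Under those assumptions the same 8 GB node goes from fitting 4 jobs to
fitting 10, which is consistent with your 5th job being stuck at ACCEPTED.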

Hope that helps!

Best,
Mark


On Oct 6, 2014, at 6:27 AM, Zach Cox <zcox522@gmail.com> wrote:

Hi - I'm just getting started with Samza. I got the hello-samza example
working properly in the vagrant box. Then I wrote 2 new tasks, rebuilt
everything and submitted them to yarn using run-job.sh. These 2 new jobs
show up in the yarn web ui, however only one of them has State=RUNNING, the
other just sits forever at State=ACCEPTED.

The Cluster Metrics section shows some interesting things:
- Apps Pending = 1
- Apps Running = 4
- Containers Running = 8
- Memory Used = 8 GB
- Memory Total = 8 GB
- Memory Reserved = 0 B

Again I'm really new to samza & yarn, but does this mean that the node on
this vagrant box has 8 GB memory available but all 8 GB is being used, so
it can't run the 5th samza job?

Are there 8 containers running because each Samza job has an
ApplicationMaster and a SamzaContainer? Are each of those containers using
1 GB memory, and that's why all the available memory is used up? Do these
containers really need 1 GB memory each? Can this be adjusted somehow?

Just trying to better understand what's going on here, and see if there's a
simple way to get both of my new tasks running in hello-samza.

Thanks,
Zach


--_000_1E76062F83EB4A8EA5FF7CE149B62E7Emachineshopio_--