From: Ken Krugler
Subject: Re: Reducing parallelism leads to NoResourceAvailableException
Date: Thu, 28 Apr 2016 08:31:52 -0700
To: user@flink.apache.org
Hi Ufuk,

> On Apr 28, 2016, at 1:32am, Ufuk Celebi wrote:
>
> Hey Ken!
>
> That should not happen. Can you check the web interface for two things:
>
> - How many available slots are advertized on the landing page
> (localhost:8081) when you submit your job?

I'm running this on YARN, so I don't believe the web UI shows up until the Flink ApplicationMaster has been started, which means I don't know the advertised number of available slots before the job is running.

> - Can you check the actual parallelism of the submitted job (it should
> appear as a FAILED job in the web frontend). Is it really 15?

Same as above, the Flink web UI is gone once the job has failed.

Any suggestions for how to check the actual parallelism in this type of transient YARN environment?

Thanks,

— Ken

> On Thu, Apr 28, 2016 at 12:52 AM, Ken Krugler
> wrote:
>> Hi all,
>>
>> In trying out different settings for performance, I ran into a job failure
>> case that puzzles me.
>>
>> I'd done a run with a parallelism of 20 (-p 20 via the CLI), and the job ran
>> successfully, on a cluster with 40 slots.
>>
>> I then tried with -p 15, and it failed with:
>>
>> NoResourceAvailableException: Not enough free slots available to run the
>> job. You can decrease the operator parallelism…
>>
>> But the change was to reduce parallelism - why would that now cause this
>> problem?
>>
>> Thanks,
>>
>> — Ken

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr
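[Editor's note: for readers puzzling over the same exception, here is a toy sketch of the slot arithmetic involved. This is not Flink's actual scheduler code; the function and the operator names are made up for illustration. It only captures the rule of thumb that, with Flink's default slot sharing, a job needs as many slots as its highest operator parallelism, while without sharing every subtask needs its own slot.]

```python
def required_slots(operators, slot_sharing=True):
    """Rough model of how many task slots a job needs.

    operators: dict mapping operator name -> parallelism.
    With the default slot-sharing group, one slot can hold one subtask
    of every operator in the pipeline, so the job needs only as many
    slots as the largest parallelism. Without sharing, each subtask
    occupies its own slot, so the requirements add up.
    """
    if slot_sharing:
        return max(operators.values())
    return sum(operators.values())

# Hypothetical three-operator job submitted with -p 15:
job = {"source": 15, "map": 15, "sink": 15}
print(required_slots(job))                       # 15 with slot sharing
print(required_slots(job, slot_sharing=False))   # 45 without sharing
```

So if slot sharing were somehow disabled, or if the YARN session were started with fewer TaskManagers (and hence fewer total slots) on the second run, a -p 15 job could indeed demand more free slots than were available, even though the -p 20 run had succeeded earlier.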