Mailing-List: contact user-help@flink.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@flink.apache.org
Date: Tue, 13 Sep 2016 18:14:16 +0000 (UTC)
From: amir bahmanyari <amirtousa@yahoo.com>
Reply-To: amir bahmanyari <amirtousa@yahoo.com>
To: "user@flink.apache.org" <user@flink.apache.org>
Message-ID: <1791313168.1968798.1473790456784@mail.yahoo.com>
In-Reply-To: <CAGr9p8C-yT6EB9o-V6apUoDeUyFQQOPHV8kbKPceAD2xyeC-iw@mail.gmail.com>
References: <485614424.1525380.1473741336909.ref@mail.yahoo.com> <485614424.1525380.1473741336909@mail.yahoo.com> <CAGr9p8C-yT6EB9o-V6apUoDeUyFQQOPHV8kbKPceAD2xyeC-iw@mail.gmail.com>
Subject: Fw: Flink Cluster Load Distribution Question
MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----=_Part_1968797_820934192.1473790456772"
archived-at: Tue, 13 Sep 2016 18:14:33 -0000

------=_Part_1968797_820934192.1473790456772
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Hi Robert,

Sure, I am forwarding it to user. Sorry about that. I followed the "robot's" instructions :))

Topology: 4 Azure A11 CentOS 7 nodes (16 cores, 110 GB). Let's call them node1, node2, node3, and node4.
The Flink cluster has node1 running the JM and a TM. Three more TMs run on node2, node3, and node4 respectively.
I have a Beam app running with the Flink Runner underneath.
The input data is read by Beam TextIO() from 1.6 GB of data containing roughly 22 million tuples.
All nodes have identical flink-conf.yaml, masters, and slaves contents, as follows:

flink-conf.yaml:
    jobmanager.rpc.address: node1
    jobmanager.rpc.port: 6123
    jobmanager.heap.mb: 1024
    taskmanager.heap.mb: 102400
    taskmanager.numberOfTaskSlots: 16
    taskmanager.memory.preallocate: false
    parallelism.default: 64
    jobmanager.web.port: 8081
    taskmanager.network.numberOfBuffers: 4096


masters:
    node1:8081

slaves:
    node1
    node2
    node3
    node4
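
As a sanity check on these numbers, here is a minimal sketch in Python (it is not part of my setup). It assumes one TaskManager per line of the slaves file, and it assumes the network-buffer rule of thumb "slots-per-TM^2 * number-of-TMs * 4", which is how I read the configuration page. The variable names are only for illustration.

```python
# Values copied from the flink-conf.yaml and slaves file above.
task_managers = 4            # node1..node4, assuming one TaskManager each
slots_per_tm = 16            # taskmanager.numberOfTaskSlots
default_parallelism = 64     # parallelism.default
configured_buffers = 4096    # taskmanager.network.numberOfBuffers

# Total slots the cluster offers to a job.
total_slots = task_managers * slots_per_tm

# Assumed rule of thumb for network buffers:
# slots-per-TM^2 * number-of-TMs * 4
suggested_buffers = slots_per_tm ** 2 * task_managers * 4

print("total slots:", total_slots)
print("parallelism fits in slots:", default_parallelism <= total_slots)
print("suggested buffers:", suggested_buffers)
print("configured buffers sufficient:", configured_buffers >= suggested_buffers)
```

With the values above this gives 64 total slots and 4096 suggested buffers, so parallelism.default and numberOfBuffers are at least consistent with the slot count on paper.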

Everything looks normal at ./start-cluster.sh, and all daemons start on all nodes.
JM and TM log files get generated on all nodes.
The Dashboard shows all slots being used.
I deploy the Beam app to the cluster on node1, where the JM is running.
A *.out file gets generated as data is being processed. There is no *.out on the other nodes, just on node1 where I deployed the fat jar.
I tail -f the *.out log on node1 (master). It starts fine, but slowly degrades and becomes extremely slow.
As we speak, I started the Beam app 13 hours ago and it's still running.

How can I prove that ALL NODES are involved in processing the data at the same time, i.e. clustered?
Do the above configurations look OK for reasonable performance?
Given the above parameters, how can I improve performance in this cluster?
What other information and/or dashboard screenshots are needed to clarify this issue?

I used these websites to do the configuration:
Apache Flink: Cluster Setup <https://ci.apache.org/projects/flink/flink-docs-release-0.8/cluster_setup.html>

Apache Flink: Configuration <https://ci.apache.org/projects/flink/flink-docs-release-0.8/config.html>

In the second link, there is a config recommendation for the following parameter, but it is not in the configuration file out of the box:
   - taskmanager.network.bufferSizeInBytes
Should I include it manually? Does it make any difference if the default value, i.e. 32 KB, doesn't get picked up?
Sorry, too many questions.
Please let me know.
I appreciate your help.
Cheers,
Amir-

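P.S. For reference, a rough sketch of the memory these settings imply per TaskManager. It assumes the 32 KB default for taskmanager.network.bufferSizeInBytes applies when the key is absent from the file, which I have not confirmed. The variable names are only for illustration.

```python
# Values from my flink-conf.yaml; the 32 KB buffer size is an assumption,
# since taskmanager.network.bufferSizeInBytes is absent from my file.
number_of_buffers = 4096          # taskmanager.network.numberOfBuffers
buffer_size_bytes = 32 * 1024     # assumed default of 32 KB

# Memory taken by network buffers on each TaskManager.
network_memory_bytes = number_of_buffers * buffer_size_bytes
network_memory_mib = network_memory_bytes / (1024 * 1024)

# Heap configured per TaskManager compared with the node's physical memory.
taskmanager_heap_mb = 102400      # taskmanager.heap.mb
node_memory_gb = 110              # Azure A11 node
heap_gb = taskmanager_heap_mb / 1024

print(f"network buffers per TaskManager: {network_memory_mib:.0f} MiB")
print(f"TaskManager heap: {heap_gb:.0f} GB of {node_memory_gb} GB on the node")
```

Under that assumption the buffers take 128 MiB per TaskManager, and the heap takes 100 GB of the node's 110 GB.
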
----- Forwarded Message -----
 From: Robert Metzger <rmetzger@apache.org>
 To: "dev@flink.apache.org" <dev@flink.apache.org>; amir bahmanyari <amirtousa@yahoo.com>
 Sent: Tuesday, September 13, 2016 1:15 AM
 Subject: Re: Flink Cluster Load Distribution Question

Hi Amir,

I would recommend posting such questions to the user@flink mailing list in
the future. This list is meant for development-related topics.

I think we need more details to understand why your application is not
running properly. Can you quickly describe what your topology is doing?
Are you setting the parallelism to a value >= 1?

Regards,
Robert


On Tue, Sep 13, 2016 at 6:35 AM, amir bahmanyari <
amirtousa@yahoo.com.invalid> wrote:

> Hi Colleagues,
> I just joined this forum.
> I have done everything possible to get a 4-node Flink cluster to work
> properly and run a Beam app.
> It always generates system-output logs (*.out) on only one node. It's so
> slow for 4 nodes being there.
> It seems like the load is not distributed amongst all 4 nodes but goes to
> only one node, most of the time the one where the JM runs.
> I ran/tested it on a single node, and it ran the same load even faster.
> Not sure what's not being configured right.
> 1- Why am I getting the SystemOut .out log on only one server? All nodes
> get their TaskManager log files updated, though.
> 2- Why don't I see load being distributed amongst all 4 nodes, but only
> one all the time?
> 3- Why does the Dashboard show 0 (zero) for Send/Receive numbers for all
> Task Managers?
> The Dashboard shows all the right stuff. Top shows not much of resources
> being stressed on any of the nodes.
> I can share its contents if it helps in diagnosing the issue.
> Thanks, I appreciate your valuable time, response, and help.
> Amir-


------=_Part_1968797_820934192.1473790456772--