flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From ned dogg <neddog...@gmail.com>
Subject Re: Issue regarding the submission of a topology to a remote flink cluster.
Date Thu, 14 Apr 2016 09:11:43 GMT
Thanks Till for the reply.

But according to you how can I address that?

Thanks,
Ned

On Thu, Apr 14, 2016 at 9:56 AM, Till Rohrmann <till.rohrmann@gmail.com>
wrote:

> Hi Ned,
>
> I think you are facing the issue described in this JIRA issue [1]. The
> problem is that you have a private and a public IP address and that Akka
> binds to the private IP address. Since the registered IP of an ActorSystem
> and the target IP address of a request to this ActorSystem have to be
> matching, you cannot reach the ActorSystem via the public IP address.
> Requests with a non-matching IP address are discarded, as indicated by the
> last log statements.
>
> [1] https://issues.apache.org/jira/browse/FLINK-2821
>
> Cheers,
> Till
>
> On Thu, Apr 14, 2016 at 10:41 AM, ned dogg <neddogg90@gmail.com> wrote:
>
> > 2016-04-14 08:23:51,900 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -
> >
> >
> --------------------------------------------------------------------------------
> > 2016-04-14 08:23:51,902 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -
> Starting
> > JobManager (Version: 1.0.1, Rev:4afa401, Date:31.03.2016 @ 13:40:33 UTC)
> > 2016-04-14 08:23:51,902 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -  Current
> > user: root
> > 2016-04-14 08:23:51,902 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -  JVM:
> > OpenJDK 64-Bit Server VM - Oracle Corporation - 1.7/24.95-b01
> > 2016-04-14 08:23:51,902 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -  Maximum
> > heap size: 247 MiBytes
> > 2016-04-14 08:23:51,902 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -
> >  JAVA_HOME: /usr/lib/jvm/java-7-openjdk-amd64
> > 2016-04-14 08:23:51,929 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -  Hadoop
> > version: 1.2.1
> > 2016-04-14 08:23:51,929 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -  JVM
> > Options:
> > 2016-04-14 08:23:51,930 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -
> > -Xms256m
> > 2016-04-14 08:23:51,930 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -
> > -Xmx256m
> > 2016-04-14 08:23:51,930 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -
> > -XX:MaxPermSize=256m
> > 2016-04-14 08:23:51,930 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -
> >
> >
> -Dlog.file=/home/ubuntu/flink/log/flink-root-jobmanager-0-tresor-testflinkth.1541841583.streamly.com.log
> > 2016-04-14 08:23:51,930 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -
> > -Dlog4j.configuration=file:/home/ubuntu/flink/conf/log4j.properties
> > 2016-04-14 08:23:51,930 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -
> > -Dlogback.configurationFile=file:/home/ubuntu/flink/conf/logback.xml
> > 2016-04-14 08:23:51,930 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -  Program
> > Arguments:
> > 2016-04-14 08:23:51,930 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -
> > --configDir
> > 2016-04-14 08:23:51,930 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -
> > /home/ubuntu/flink/conf
> > 2016-04-14 08:23:51,930 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -
> > --executionMode
> > 2016-04-14 08:23:51,930 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -
> >  cluster
> > 2016-04-14 08:23:51,930 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -
> >  Classpath:
> >
> >
> /home/ubuntu/flink/lib/flink-dist_2.10-1.0.1.jar:/home/ubuntu/flink/lib/flink-python_2.10-1.0.1.jar:/home/ubuntu/flink/lib/log4j-1.2.17.jar:/home/ubuntu/flink/lib/slf4j-log4j12-1.7.7.jar:::
> > 2016-04-14 08:23:51,930 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -
> >
> >
> --------------------------------------------------------------------------------
> > 2016-04-14 08:23:51,931 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -
> Registered
> > UNIX signal handlers for [TERM, HUP, INT]
> > 2016-04-14 08:23:52,362 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                - Loading
> > configuration from /home/ubuntu/flink/conf
> > 2016-04-14 08:23:52,400 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                - Starting
> > JobManager without high-availability
> > 2016-04-14 08:23:52,408 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                - Starting
> > JobManager on 172.31.45.232:6123 with execution mode CLUSTER
> > 2016-04-14 08:23:52,655 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                - Security
> > is not enabled. Starting non-authenticated JobManager.
> > 2016-04-14 08:23:52,701 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                - Starting
> > JobManager
> > 2016-04-14 08:23:52,701 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                - Starting
> > JobManager actor system at 172.31.45.232:6123
> > 2016-04-14 08:23:54,091 INFO  akka.event.slf4j.Slf4jLogger
> >                  - Slf4jLogger started
> > 2016-04-14 08:23:54,293 INFO  Remoting
> >                  - Starting remoting
> > 2016-04-14 08:23:54,712 INFO  Remoting
> >                  - Remoting started; listening on addresses :[akka.tcp://
> > flink@172.31.45.232:6123]
> > 2016-04-14 08:23:54,732 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                - Starting
> > JobManger web frontend
> > 2016-04-14 08:23:54,777 INFO
> >  org.apache.flink.runtime.webmonitor.WebMonitorUtils           -
> Determined
> > location of JobManager log file:
> >
> >
> /home/ubuntu/flink/log/flink-root-jobmanager-0-tresor-testflinkth.1541841583.streamly.com.log
> > 2016-04-14 08:23:54,777 INFO
> >  org.apache.flink.runtime.webmonitor.WebMonitorUtils           -
> Determined
> > location of JobManager stdout file:
> >
> >
> /home/ubuntu/flink/log/flink-root-jobmanager-0-tresor-testflinkth.1541841583.streamly.com.out
> > 2016-04-14 08:23:54,805 INFO
> >  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Using
> > directory /tmp/flink-web-dadb12c4-4ac9-42d6-b127-712db44b4add for the web
> > interface files
> > 2016-04-14 08:23:54,805 INFO
> >  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Using
> > directory /tmp/flink-web-upload-88dc8823-f114-4787-a6f4-f1955380e384 for
> > web frontend JAR file uploads
> > 2016-04-14 08:23:55,600 INFO
> >  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Web
> > frontend listening at 0:0:0:0:0:0:0:0:8081
> > 2016-04-14 08:23:55,601 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                - Starting
> > JobManager actor
> > 2016-04-14 08:23:55,619 INFO  org.apache.flink.runtime.blob.BlobServer
> >                  - Created BLOB server storage directory
> > /tmp/blobStore-bbc7f33e-fa65-41f8-8c68-549e9707fd56
> > 2016-04-14 08:23:55,634 INFO  org.apache.flink.runtime.blob.BlobServer
> >                  - Started BLOB server at 0.0.0.0:60439 - max concurrent
> > requests: 50 - max backlog: 1000
> > 2016-04-14 08:23:55,653 INFO
> >  org.apache.flink.runtime.checkpoint.SavepointStoreFactory     - Using
> job
> > manager savepoint state backend.
> > 2016-04-14 08:23:55,678 INFO
> >  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Starting
> > with JobManager akka.tcp://flink@172.31.45.232:6123/user/jobmanager on
> > port
> > 8081
> > 2016-04-14 08:23:55,678 INFO
> >  org.apache.flink.runtime.webmonitor.JobManagerRetriever       - New
> leader
> > reachable under akka.tcp://flink@172.31.45.232:6123/user/jobmanager:null
> .
> > 2016-04-14 08:23:55,692 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                - Starting
> > JobManager at akka.tcp://flink@172.31.45.232:6123/user/jobmanager.
> > 2016-04-14 08:23:55,696 INFO
> >  org.apache.flink.runtime.jobmanager.MemoryArchivist           - Started
> > memory archivist akka://flink/user/archive
> > 2016-04-14 08:23:55,702 INFO
> >  org.apache.flink.runtime.jobmanager.JobManager                -
> JobManager
> > akka.tcp://flink@172.31.45.232:6123/user/jobmanager was granted
> leadership
> > with leader session ID None.
> > 2016-04-14 08:24:12,740 INFO
> >  org.apache.flink.runtime.instance.InstanceManager             -
> Registered
> > TaskManager at tresor-testflinkth (akka.tcp://
> > flink@172.31.45.130:42189/user/taskmanager) as
> > 734b6b21dd60760c7a05722db3c3d1c4. Current number of registered hosts is
> 1.
> > Current number of alive task slots is 1.
> > 2016-04-14 08:24:17,855 INFO
> >  org.apache.flink.runtime.instance.InstanceManager             -
> Registered
> > TaskManager at tresor-testflinkth (akka.tcp://
> > flink@172.31.34.121:58814/user/taskmanager) as
> > fead50f6831aa3f341d58162bb918d90. Current number of registered hosts is
> 2.
> > Current number of alive task slots is 2.
> > 2016-04-14 08:29:50,346 ERROR akka.remote.EndpointWriter
> >                  - dropping message [class
> > akka.actor.ActorSelectionMessage] for non-local recipient
> > [Actor[akka.tcp://
> > flink@54.233.183.228:6123/]] arriving at [akka.tcp://
> > flink@54.233.183.228:6123] inbound addresses are [akka.tcp://
> > flink@172.31.45.232:6123]
> > 2016-04-14 08:29:59,777 WARN  akka.remote.ReliableDeliverySupervisor
> >                  - Association with remote system [akka.tcp://
> > flink@127.0.0.1:35953] has failed, address is now gated for [5000] ms.
> > Reason is: [Disassociated].
> >
> >
> > Where 54.233.183.228 <http://flink@54.233.183.228:6123> is the public
> > address of the VM hosting job manager and 172.31.45.232
> > <http://flink@172.31.45.232:6123> is it private address.
> > 172.31.45.130 <http://flink@172.31.45.130:42189/user/taskmanager> and
> > 172.31.34.121 <http://flink@172.31.34.121:58814/user/taskmanager> are
> the
> > private address of the task manager
> >
> > On Thu, Apr 14, 2016 at 9:05 AM, Till Rohrmann <trohrmann@apache.org>
> > wrote:
> >
> > > I'm referring to the jobmanager.log file not the client log file. You
> can
> > > find it in the `/log` directory.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Thu, Apr 14, 2016 at 9:56 AM, ned dogg <neddogg90@gmail.com> wrote:
> > >
> > > > Hi Till
> > > >
> > > > Thanks for the prompt reply.
> > > >
> > > > The logs say that Please make sure that the actor is running and its
> > port
> > > > is reachable.
> > > > And it is actaully reachable because I can ping that address.
> > > >
> > > > Ned.
> > > >
> > > > On Thu, Apr 14, 2016 at 8:43 AM, Till Rohrmann <
> > till.rohrmann@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Ned,
> > > > >
> > > > > what does the logs of the JobManager say?
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > > On Apr 14, 2016 9:19 AM, "ned dogg" <neddogg90@gmail.com> wrote:
> > > > >
> > > > > > Hi everybody,
> > > > > >
> > > > > > I'm Ned, a young and passionte developer of apache technologies.
> I
> > > have
> > > > > > been playing with apache flink lastly.
> > > > > >
> > > > > > This is what I wanted to do submit a flink topology to a remote
> > flink
> > > > > > cluster. The following are the steps that I did.
> > > > > >
> > > > > > - Install flink as a cluster indicated on the link
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://ci.apache.org/projects/flink/flink-docs-master/setup/cluster_setup.html
> > > > > > on three remotes VMs.
> > > > > > - Run the sample WordCountRemoteByClient
> > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/flink/blob/master/flink-contrib/flink-storm-examples/src/main/java/org/apache/flink/storm/wordcount/WordCountRemoteByClient.java
> > > > > > >
> > > > > > by
> > > > > > changing
> > > > > > conf.put(Config.NIMBUS_HOST, "localhost"); to
> > > > > > conf.put(Config.NIMBUS_HOST,
> > > "publicIpOfJobmanagerInMyRemoteCluster");
> > > > > >
> > > > > > Unfortunately for me when I run that program, I have a the
> > following
> > > > > > exception.
> > > > > >
> > > > > > org.apache.flink.client.program.ProgramInvocationException:
The
> > main
> > > > > method
> > > > > > caused an error.
> > > > > > at
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:520)
> > > > > > at
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:403)
> > > > > > at
> > > org.apache.flink.client.program.Client.runBlocking(Client.java:248)
> > > > > > at
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.client.CliFrontend.executeProgramBlocking(CliFrontend.java:866)
> > > > > > at org.apache.flink.client.CliFrontend.run(CliFrontend.java:333)
> > > > > > at
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1189)
> > > > > > at
> org.apache.flink.client.CliFrontend.main(CliFrontend.java:1239)
> > > > > > Caused by: java.lang.RuntimeException: Could not connect to
Flink
> > > > > > JobManager with address
> publicIpOfJobmanagerInMyRemoteCluster:6123
> > > > > > at
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.storm.api.FlinkClient.getTopologyJobId(FlinkClient.java:305)
> > > > > > at
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.storm.api.FlinkClient.submitTopologyWithOpts(FlinkClient.java:177)
> > > > > > at
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.storm.api.FlinkClient.submitTopology(FlinkClient.java:167)
> > > > > > at
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> stormWorldCount.WordCountRemoteByClient.main(WordCountRemoteByClient.java:72)
> > > > > > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > > > > at
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > > > > > at
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > > > > at java.lang.reflect.Method.invoke(Method.java:483)
> > > > > > at
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:505)
> > > > > > ... 6 more
> > > > > > Caused by: java.io.IOException: Actor at akka.tcp://flink@
> > > > > > publicIpOfJobmanagerInMyRemoteCluster:6123/user/jobmanager not
> > > > reachable.
> > > > > > Please make sure that the actor is running and its port is
> > reachable.
> > > > > > at
> > > > > >
> > > >
> > org.apache.flink.runtime.akka.AkkaUtils$.getActorRef(AkkaUtils.scala:384)
> > > > > > at
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.jobmanager.JobManager$.getJobManagerActorRef(JobManager.scala:2380)
> > > > > > at
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.jobmanager.JobManager$.getJobManagerActorRef(JobManager.scala:2400)
> > > > > > at
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.runtime.jobmanager.JobManager.getJobManagerActorRef(JobManager.scala)
> > > > > > at
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.storm.api.FlinkClient.getJobManager(FlinkClient.java:333)
> > > > > > at
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.flink.storm.api.FlinkClient.getTopologyJobId(FlinkClient.java:279)
> > > > > > ... 14 more
> > > > > > Caused by: java.util.concurrent.TimeoutException: Futures timed
> out
> > > > after
> > > > > > [10000 milliseconds]
> > > > > > at
> > > > scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
> > > > > > at
> > > >
> scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
> > > > > > at
> > scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
> > > > > > at
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
> > > > > > at scala.concurrent.Await$.result(package.scala:107)
> > > > > > at
> > > > > >
> > > >
> > org.apache.flink.runtime.akka.AkkaUtils$.getActorRef(AkkaUtils.scala:380)
> > > > > > ... 19 more
> > > > > >
> > > > > > I try ping my jobmanager with
> > > > > > curl publicIpOfJobmanagerInMyRemoteCluster:6123 I had the
> following
> > > as
> > > > > > responces.
> > > > > >
> > > > > > curl: (52) Empty reply from server
> > > > > >
> > > > > > Which is an indication that the job manager is reachable.
> > > > > >
> > > > > > So I was wondering if I doing it the right way. Please any help
> > will
> > > be
> > > > > > welcoming.
> > > > > >
> > > > > > Thanks,
> > > > > > Ned
> > > > > >
> > > > >
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message