flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From ned dogg <neddog...@gmail.com>
Subject Re: Issue regarding the submission of a topology to a remote flink cluster.
Date Thu, 14 Apr 2016 08:41:16 GMT
2016-04-14 08:23:51,900 INFO
 org.apache.flink.runtime.jobmanager.JobManager                -
--------------------------------------------------------------------------------
2016-04-14 08:23:51,902 INFO
 org.apache.flink.runtime.jobmanager.JobManager                -  Starting
JobManager (Version: 1.0.1, Rev:4afa401, Date:31.03.2016 @ 13:40:33 UTC)
2016-04-14 08:23:51,902 INFO
 org.apache.flink.runtime.jobmanager.JobManager                -  Current
user: root
2016-04-14 08:23:51,902 INFO
 org.apache.flink.runtime.jobmanager.JobManager                -  JVM:
OpenJDK 64-Bit Server VM - Oracle Corporation - 1.7/24.95-b01
2016-04-14 08:23:51,902 INFO
 org.apache.flink.runtime.jobmanager.JobManager                -  Maximum
heap size: 247 MiBytes
2016-04-14 08:23:51,902 INFO
 org.apache.flink.runtime.jobmanager.JobManager                -
 JAVA_HOME: /usr/lib/jvm/java-7-openjdk-amd64
2016-04-14 08:23:51,929 INFO
 org.apache.flink.runtime.jobmanager.JobManager                -  Hadoop
version: 1.2.1
2016-04-14 08:23:51,929 INFO
 org.apache.flink.runtime.jobmanager.JobManager                -  JVM
Options:
2016-04-14 08:23:51,930 INFO
 org.apache.flink.runtime.jobmanager.JobManager                -
-Xms256m
2016-04-14 08:23:51,930 INFO
 org.apache.flink.runtime.jobmanager.JobManager                -
-Xmx256m
2016-04-14 08:23:51,930 INFO
 org.apache.flink.runtime.jobmanager.JobManager                -
-XX:MaxPermSize=256m
2016-04-14 08:23:51,930 INFO
 org.apache.flink.runtime.jobmanager.JobManager                -
-Dlog.file=/home/ubuntu/flink/log/flink-root-jobmanager-0-tresor-testflinkth.1541841583.streamly.com.log
2016-04-14 08:23:51,930 INFO
 org.apache.flink.runtime.jobmanager.JobManager                -
-Dlog4j.configuration=file:/home/ubuntu/flink/conf/log4j.properties
2016-04-14 08:23:51,930 INFO
 org.apache.flink.runtime.jobmanager.JobManager                -
-Dlogback.configurationFile=file:/home/ubuntu/flink/conf/logback.xml
2016-04-14 08:23:51,930 INFO
 org.apache.flink.runtime.jobmanager.JobManager                -  Program
Arguments:
2016-04-14 08:23:51,930 INFO
 org.apache.flink.runtime.jobmanager.JobManager                -
--configDir
2016-04-14 08:23:51,930 INFO
 org.apache.flink.runtime.jobmanager.JobManager                -
/home/ubuntu/flink/conf
2016-04-14 08:23:51,930 INFO
 org.apache.flink.runtime.jobmanager.JobManager                -
--executionMode
2016-04-14 08:23:51,930 INFO
 org.apache.flink.runtime.jobmanager.JobManager                -     cluster
2016-04-14 08:23:51,930 INFO
 org.apache.flink.runtime.jobmanager.JobManager                -
 Classpath:
/home/ubuntu/flink/lib/flink-dist_2.10-1.0.1.jar:/home/ubuntu/flink/lib/flink-python_2.10-1.0.1.jar:/home/ubuntu/flink/lib/log4j-1.2.17.jar:/home/ubuntu/flink/lib/slf4j-log4j12-1.7.7.jar:::
2016-04-14 08:23:51,930 INFO
 org.apache.flink.runtime.jobmanager.JobManager                -
--------------------------------------------------------------------------------
2016-04-14 08:23:51,931 INFO
 org.apache.flink.runtime.jobmanager.JobManager                - Registered
UNIX signal handlers for [TERM, HUP, INT]
2016-04-14 08:23:52,362 INFO
 org.apache.flink.runtime.jobmanager.JobManager                - Loading
configuration from /home/ubuntu/flink/conf
2016-04-14 08:23:52,400 INFO
 org.apache.flink.runtime.jobmanager.JobManager                - Starting
JobManager without high-availability
2016-04-14 08:23:52,408 INFO
 org.apache.flink.runtime.jobmanager.JobManager                - Starting
JobManager on 172.31.45.232:6123 with execution mode CLUSTER
2016-04-14 08:23:52,655 INFO
 org.apache.flink.runtime.jobmanager.JobManager                - Security
is not enabled. Starting non-authenticated JobManager.
2016-04-14 08:23:52,701 INFO
 org.apache.flink.runtime.jobmanager.JobManager                - Starting
JobManager
2016-04-14 08:23:52,701 INFO
 org.apache.flink.runtime.jobmanager.JobManager                - Starting
JobManager actor system at 172.31.45.232:6123
2016-04-14 08:23:54,091 INFO  akka.event.slf4j.Slf4jLogger
                 - Slf4jLogger started
2016-04-14 08:23:54,293 INFO  Remoting
                 - Starting remoting
2016-04-14 08:23:54,712 INFO  Remoting
                 - Remoting started; listening on addresses :[akka.tcp://
flink@172.31.45.232:6123]
2016-04-14 08:23:54,732 INFO
 org.apache.flink.runtime.jobmanager.JobManager                - Starting
JobManger web frontend
2016-04-14 08:23:54,777 INFO
 org.apache.flink.runtime.webmonitor.WebMonitorUtils           - Determined
location of JobManager log file:
/home/ubuntu/flink/log/flink-root-jobmanager-0-tresor-testflinkth.1541841583.streamly.com.log
2016-04-14 08:23:54,777 INFO
 org.apache.flink.runtime.webmonitor.WebMonitorUtils           - Determined
location of JobManager stdout file:
/home/ubuntu/flink/log/flink-root-jobmanager-0-tresor-testflinkth.1541841583.streamly.com.out
2016-04-14 08:23:54,805 INFO
 org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Using
directory /tmp/flink-web-dadb12c4-4ac9-42d6-b127-712db44b4add for the web
interface files
2016-04-14 08:23:54,805 INFO
 org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Using
directory /tmp/flink-web-upload-88dc8823-f114-4787-a6f4-f1955380e384 for
web frontend JAR file uploads
2016-04-14 08:23:55,600 INFO
 org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Web
frontend listening at 0:0:0:0:0:0:0:0:8081
2016-04-14 08:23:55,601 INFO
 org.apache.flink.runtime.jobmanager.JobManager                - Starting
JobManager actor
2016-04-14 08:23:55,619 INFO  org.apache.flink.runtime.blob.BlobServer
                 - Created BLOB server storage directory
/tmp/blobStore-bbc7f33e-fa65-41f8-8c68-549e9707fd56
2016-04-14 08:23:55,634 INFO  org.apache.flink.runtime.blob.BlobServer
                 - Started BLOB server at 0.0.0.0:60439 - max concurrent
requests: 50 - max backlog: 1000
2016-04-14 08:23:55,653 INFO
 org.apache.flink.runtime.checkpoint.SavepointStoreFactory     - Using job
manager savepoint state backend.
2016-04-14 08:23:55,678 INFO
 org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Starting
with JobManager akka.tcp://flink@172.31.45.232:6123/user/jobmanager on port
8081
2016-04-14 08:23:55,678 INFO
 org.apache.flink.runtime.webmonitor.JobManagerRetriever       - New leader
reachable under akka.tcp://flink@172.31.45.232:6123/user/jobmanager:null.
2016-04-14 08:23:55,692 INFO
 org.apache.flink.runtime.jobmanager.JobManager                - Starting
JobManager at akka.tcp://flink@172.31.45.232:6123/user/jobmanager.
2016-04-14 08:23:55,696 INFO
 org.apache.flink.runtime.jobmanager.MemoryArchivist           - Started
memory archivist akka://flink/user/archive
2016-04-14 08:23:55,702 INFO
 org.apache.flink.runtime.jobmanager.JobManager                - JobManager
akka.tcp://flink@172.31.45.232:6123/user/jobmanager was granted leadership
with leader session ID None.
2016-04-14 08:24:12,740 INFO
 org.apache.flink.runtime.instance.InstanceManager             - Registered
TaskManager at tresor-testflinkth (akka.tcp://
flink@172.31.45.130:42189/user/taskmanager) as
734b6b21dd60760c7a05722db3c3d1c4. Current number of registered hosts is 1.
Current number of alive task slots is 1.
2016-04-14 08:24:17,855 INFO
 org.apache.flink.runtime.instance.InstanceManager             - Registered
TaskManager at tresor-testflinkth (akka.tcp://
flink@172.31.34.121:58814/user/taskmanager) as
fead50f6831aa3f341d58162bb918d90. Current number of registered hosts is 2.
Current number of alive task slots is 2.
2016-04-14 08:29:50,346 ERROR akka.remote.EndpointWriter
                 - dropping message [class
akka.actor.ActorSelectionMessage] for non-local recipient [Actor[akka.tcp://
flink@54.233.183.228:6123/]] arriving at [akka.tcp://
flink@54.233.183.228:6123] inbound addresses are [akka.tcp://
flink@172.31.45.232:6123]
2016-04-14 08:29:59,777 WARN  akka.remote.ReliableDeliverySupervisor
                 - Association with remote system [akka.tcp://
flink@127.0.0.1:35953] has failed, address is now gated for [5000] ms.
Reason is: [Disassociated].


Where 54.233.183.228 <http://flink@54.233.183.228:6123> is the public
address of the VM hosting job manager and 172.31.45.232
<http://flink@172.31.45.232:6123> is it private address.
172.31.45.130 <http://flink@172.31.45.130:42189/user/taskmanager> and
172.31.34.121 <http://flink@172.31.34.121:58814/user/taskmanager> are the
private address of the task manager

On Thu, Apr 14, 2016 at 9:05 AM, Till Rohrmann <trohrmann@apache.org> wrote:

> I'm referring to the jobmanager.log file not the client log file. You can
> find it in the `/log` directory.
>
> Cheers,
> Till
>
> On Thu, Apr 14, 2016 at 9:56 AM, ned dogg <neddogg90@gmail.com> wrote:
>
> > Hi Till
> >
> > Thanks for the prompt reply.
> >
> > The logs say that Please make sure that the actor is running and its port
> > is reachable.
> > And it is actaully reachable because I can ping that address.
> >
> > Ned.
> >
> > On Thu, Apr 14, 2016 at 8:43 AM, Till Rohrmann <till.rohrmann@gmail.com>
> > wrote:
> >
> > > Hi Ned,
> > >
> > > what does the logs of the JobManager say?
> > >
> > > Cheers,
> > > Till
> > > On Apr 14, 2016 9:19 AM, "ned dogg" <neddogg90@gmail.com> wrote:
> > >
> > > > Hi everybody,
> > > >
> > > > I'm Ned, a young and passionte developer of apache technologies. I
> have
> > > > been playing with apache flink lastly.
> > > >
> > > > This is what I wanted to do submit a flink topology to a remote flink
> > > > cluster. The following are the steps that I did.
> > > >
> > > > - Install flink as a cluster indicated on the link
> > > >
> > > >
> > >
> >
> https://ci.apache.org/projects/flink/flink-docs-master/setup/cluster_setup.html
> > > > on three remotes VMs.
> > > > - Run the sample WordCountRemoteByClient
> > > > <
> > > >
> > >
> >
> https://github.com/apache/flink/blob/master/flink-contrib/flink-storm-examples/src/main/java/org/apache/flink/storm/wordcount/WordCountRemoteByClient.java
> > > > >
> > > > by
> > > > changing
> > > > conf.put(Config.NIMBUS_HOST, "localhost"); to
> > > > conf.put(Config.NIMBUS_HOST,
> "publicIpOfJobmanagerInMyRemoteCluster");
> > > >
> > > > Unfortunately for me when I run that program, I have a the following
> > > > exception.
> > > >
> > > > org.apache.flink.client.program.ProgramInvocationException: The main
> > > method
> > > > caused an error.
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:520)
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:403)
> > > > at
> org.apache.flink.client.program.Client.runBlocking(Client.java:248)
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.flink.client.CliFrontend.executeProgramBlocking(CliFrontend.java:866)
> > > > at org.apache.flink.client.CliFrontend.run(CliFrontend.java:333)
> > > > at
> > > >
> > >
> >
> org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1189)
> > > > at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1239)
> > > > Caused by: java.lang.RuntimeException: Could not connect to Flink
> > > > JobManager with address publicIpOfJobmanagerInMyRemoteCluster:6123
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.flink.storm.api.FlinkClient.getTopologyJobId(FlinkClient.java:305)
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.flink.storm.api.FlinkClient.submitTopologyWithOpts(FlinkClient.java:177)
> > > > at
> > > >
> > >
> >
> org.apache.flink.storm.api.FlinkClient.submitTopology(FlinkClient.java:167)
> > > > at
> > > >
> > > >
> > >
> >
> stormWorldCount.WordCountRemoteByClient.main(WordCountRemoteByClient.java:72)
> > > > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > > at
> > > >
> > > >
> > >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > > > at
> > > >
> > > >
> > >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > > at java.lang.reflect.Method.invoke(Method.java:483)
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:505)
> > > > ... 6 more
> > > > Caused by: java.io.IOException: Actor at akka.tcp://flink@
> > > > publicIpOfJobmanagerInMyRemoteCluster:6123/user/jobmanager not
> > reachable.
> > > > Please make sure that the actor is running and its port is reachable.
> > > > at
> > > >
> > org.apache.flink.runtime.akka.AkkaUtils$.getActorRef(AkkaUtils.scala:384)
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.flink.runtime.jobmanager.JobManager$.getJobManagerActorRef(JobManager.scala:2380)
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.flink.runtime.jobmanager.JobManager$.getJobManagerActorRef(JobManager.scala:2400)
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.flink.runtime.jobmanager.JobManager.getJobManagerActorRef(JobManager.scala)
> > > > at
> > > >
> > >
> >
> org.apache.flink.storm.api.FlinkClient.getJobManager(FlinkClient.java:333)
> > > > at
> > > >
> > > >
> > >
> >
> org.apache.flink.storm.api.FlinkClient.getTopologyJobId(FlinkClient.java:279)
> > > > ... 14 more
> > > > Caused by: java.util.concurrent.TimeoutException: Futures timed out
> > after
> > > > [10000 milliseconds]
> > > > at
> > scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
> > > > at
> > scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
> > > > at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
> > > > at
> > > >
> > > >
> > >
> >
> scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
> > > > at scala.concurrent.Await$.result(package.scala:107)
> > > > at
> > > >
> > org.apache.flink.runtime.akka.AkkaUtils$.getActorRef(AkkaUtils.scala:380)
> > > > ... 19 more
> > > >
> > > > I try ping my jobmanager with
> > > > curl publicIpOfJobmanagerInMyRemoteCluster:6123 I had the following
> as
> > > > responces.
> > > >
> > > > curl: (52) Empty reply from server
> > > >
> > > > Which is an indication that the job manager is reachable.
> > > >
> > > > So I was wondering if I doing it the right way. Please any help will
> be
> > > > welcoming.
> > > >
> > > > Thanks,
> > > > Ned
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message