flink-dev mailing list archives

From Dulaj Viduranga <vidura...@icloud.com>
Subject Re: Could not build up connection to JobManager
Date Mon, 16 Mar 2015 17:11:52 GMT
Hi,
I tested the update, but the result is still the same. I don't think the problem is with my
system: I have an XAMPP server that works fine (I also tried with it shut down), and I
double-checked my hosts file. I had Little Snitch installed, but I also tried uninstalling
it.
Isn't there a workaround that avoids using DNS to resolve localhost?
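
For anyone who wants to reproduce the check outside of Flink, here is a minimal sketch of how the resolution could be inspected with plain JDK calls. It is not Flink code: the class name LocalhostCheck and its two helper methods are made up for illustration, and the default port 6123 is simply the JobManager port that appears in the logs quoted below. It prints every address the JVM resolves for a host name, marks IPv6 and link-local results, and then tries a plain TCP connect.

```java
import java.io.IOException;
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.UnknownHostException;

// Hypothetical diagnostic helper, not part of Flink.
class LocalhostCheck {

    /** Describes an address: textual form, IP family, and loopback/link-local flags. */
    static String describe(InetAddress addr) {
        String family = (addr instanceof Inet6Address) ? "IPv6" : "IPv4";
        return addr.getHostAddress() + " [" + family
                + (addr.isLoopbackAddress() ? ", loopback" : "")
                + (addr.isLinkLocalAddress() ? ", link-local" : "") + "]";
    }

    /** Returns true if a plain TCP connection to host:port succeeds within timeoutMs. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) throws UnknownHostException {
        String host = args.length > 0 ? args[0] : "localhost";
        // 6123 is assumed here because it is the JobManager port shown in the logs.
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 6123;

        // All addresses the JVM resolves for the name, in the order it would try them.
        for (InetAddress addr : InetAddress.getAllByName(host)) {
            System.out.println(host + " -> " + describe(addr));
        }
        // The address the JVM considers to be this machine's own.
        System.out.println("getLocalHost() -> " + describe(InetAddress.getLocalHost()));
        System.out.println("TCP connect to " + host + ":" + port + " -> "
                + canConnect(host, port, 2000));
    }
}
```

Running it once as is and once with -Djava.net.preferIPv4Stack=true on the java command line should show whether the IPv6 link-local address comes from the JVM's own resolution. Note that a successful TCP connect only shows that the port is reachable; as Till points out below, Akka may still refuse a connection that is addressed differently from the address it bound to.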

> On Mar 16, 2015, at 10:04 PM, Till Rohrmann <trohrmann@apache.org> wrote:
> 
> It is really strange. It's right that the CliFrontend now resolves
> localhost to the correct local address 10.218.100.122. Moreover, according
> to the logs, the JobManager is also started and binds to
> akka.tcp://flink@10.218.100.122:6123. According to the logs, this is also the address
> the CliFrontend uses to connect to the JobManager. If the timestamps are
> correct, then the JobManager was still alive when the job was sent. I don't
> really understand why this happens. Can it be that the CliFrontend which
> binds to 127.0.0.1 cannot communicate with 10.218.100.122? Can it be that
> you have some settings which prevent this? For the failing 127.0.0.1 case,
> it would be helpful to have access to the JobManager log.
> 
> I've updated the branch
> https://github.com/tillrohrmann/flink/tree/fixJobClient with a new fix for
> the "localhost" scenario. Could you try it out again? Thanks a lot for your
> help.
> 
> Best regards,
> 
> Till
> 
> On Mon, Mar 16, 2015 at 10:30 AM, Ufuk Celebi <uce@apache.org> wrote:
> 
>> There was an issue for this:
>> https://issues.apache.org/jira/browse/FLINK-1634
>> 
>> Can we close it then?
>> 
>> On Sat, Mar 14, 2015 at 9:16 PM, Dulaj Viduranga <vidura.me@icloud.com> wrote:
>> 
>>> Hey Stephan,
>>> Great to know you could fix the issue. Thank you for the update.
>>> Best regards.
>>> 
>>>> On Mar 14, 2015, at 9:19 PM, Stephan Ewen <sewen@apache.org> wrote:
>>>> 
>>>> Hey Dulaj!
>>>> 
>>>> Forget what I said in the previous email. The issue with the wrong address
>>>> binding seems to be solved now. There is another issue: the embedded
>>>> taskmanager does not start properly, for whatever reason. My gut feeling is
>>>> that there is something wrong
>>>> 
>>>> There is a patch pending that changes the startup behavior to make these
>>>> situations much easier to debug. I'll ping you as soon as that is in...
>>>> 
>>>> 
>>>> Stephan
>>>> 
>>>> On Sat, Mar 14, 2015 at 4:42 PM, Stephan Ewen <sewen@apache.org> wrote:
>>>> 
>>>>> Hey Dulaj!
>>>>> 
>>>>> One thing you can try is to add the option "-Djava.net.preferIPv4Stack=true"
>>>>> to the JVM startup options (in the scripts in the "bin" folder) and see
>>>>> if that helps.
>>>>> 
>>>>> Stephan
>>>>> 
>>>>> 
>>>>> On Sat, Mar 14, 2015 at 4:29 AM, Dulaj Viduranga <vidura.me@icloud.com> wrote:
>>>>> 
>>>>>> Hi,
>>>>>> Still no luck. I'll upload the logs for the configuration with
>>>>>> "localhost" as well as with "127.0.0.1" so you can take a look.
>>>>>> 
>>>>>> 127.0.0.1
>>>>>> flink-Vidura-flink-client-localhost.log <https://gist.github.com/viduranga/1d01149eee238158519e#file-flink-vidura-flink-client-localhost-log>
>>>>>> 
>>>>>> localhost
>>>>>> flink-Vidura-flink-client-localhost.log <https://gist.github.com/viduranga/d866c24c0ba566abab17#file-flink-vidura-flink-client-localhost-log>
>>>>>> flink-Vidura-jobmanager-localhost.log <https://gist.github.com/viduranga/e7549ef818c6a2af73e9#file-flink-vidura-jobmanager-localhost-log>
>>>>>> 
>>>>>>> On Mar 11, 2015, at 11:32 PM, Till Rohrmann <trohrmann@apache.org> wrote:
>>>>>>> 
>>>>>>> Hi Dulaj,
>>>>>>> 
>>>>>>> sorry for my late response. It looks as if the JobClient tries to connect
>>>>>>> to the JobManager using its IPv6 address instead of its IPv4 address. Akka
>>>>>>> is really picky when it comes to remote addresses. If Akka binds to the
>>>>>>> FQDN, then other ActorSystems that try to connect to it using its IP
>>>>>>> address won't be successful. I assume that this might be the problem. I
>>>>>>> tried to fix it; you can find the fix here [1]. Could you please try it
>>>>>>> out by starting a local cluster with the start-local.sh script? If it
>>>>>>> fails, could you please send me all log files (client, jobmanager and
>>>>>>> taskmanager)? Once we have figured out why the JobClient does not
>>>>>>> connect, we can try to tackle the BlobServer issue.
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> 
>>>>>>> Till
>>>>>>> 
>>>>>>> [1] https://github.com/tillrohrmann/flink/tree/fixJobClient
>>>>>>> 
>>>>>>> On Thu, Mar 5, 2015 at 4:40 PM, Dulaj Viduranga <vidura.me@icloud.com> wrote:
>>>>>>> 
>>>>>>>> Hi,
>>>>>>>> The error message is,
>>>>>>>> 
>>>>>>>> 21:06:01,521 WARN  org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>>>>>>> org.apache.flink.client.program.ProgramInvocationException: Could not build up connection to JobManager.
>>>>>>>>      at org.apache.flink.client.program.Client.run(Client.java:327)
>>>>>>>>      at org.apache.flink.client.program.Client.run(Client.java:306)
>>>>>>>>      at org.apache.flink.client.program.Client.run(Client.java:300)
>>>>>>>>      at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:55)
>>>>>>>>      at org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:82)
>>>>>>>>      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>>      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>>>>>>      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>>>      at java.lang.reflect.Method.invoke(Method.java:483)
>>>>>>>>      at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:437)
>>>>>>>>      at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:353)
>>>>>>>>      at org.apache.flink.client.program.Client.run(Client.java:250)
>>>>>>>>      at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:371)
>>>>>>>>      at org.apache.flink.client.CliFrontend.run(CliFrontend.java:344)
>>>>>>>>      at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1087)
>>>>>>>>      at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1114)
>>>>>>>> Caused by: java.io.IOException: JobManager at akka.tcp://flink@fe80:0:0:0:742b:7f78:fab5:68e2%11:6123/user/jobmanager not reachable. Please make sure that the JobManager is running and its port is reachable.
>>>>>>>>      at org.apache.flink.runtime.jobmanager.JobManager$.getJobManagerRemoteReference(JobManager.scala:957)
>>>>>>>>      at org.apache.flink.runtime.client.JobClient$.createJobClient(JobClient.scala:151)
>>>>>>>>      at org.apache.flink.runtime.client.JobClient$.createJobClientFromConfig(JobClient.scala:142)
>>>>>>>>      at org.apache.flink.runtime.client.JobClient$.startActorSystemAndActor(JobClient.scala:125)
>>>>>>>>      at org.apache.flink.runtime.client.JobClient.startActorSystemAndActor(JobClient.scala)
>>>>>>>>      at org.apache.flink.client.program.Client.run(Client.java:322)
>>>>>>>>      ... 15 more
>>>>>>>> Caused by: akka.actor.ActorNotFound: Actor not found for: ActorSelection[Anchor(akka://flink/deadLetters), Path(/)]
>>>>>>>>      at akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:65)
>>>>>>>>      at akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:63)
>>>>>>>>      at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
>>>>>>>>      at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.processBatch$1(BatchingExecutor.scala:67)
>>>>>>>>      at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:82)
>>>>>>>>      at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
>>>>>>>>      at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
>>>>>>>>      at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
>>>>>>>>      at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:58)
>>>>>>>>      at akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.unbatchedExecute(Future.scala:74)
>>>>>>>>      at akka.dispatch.BatchingExecutor$class.execute(BatchingExecutor.scala:110)
>>>>>>>>      at akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.execute(Future.scala:73)
>>>>>>>>      at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:40)
>>>>>>>>      at scala.concurrent.impl.Promise$DefaultPromise.scala$concurrent$impl$Promise$DefaultPromise$$dispatchOrAddCallback(Promise.scala:280)
>>>>>>>>      at scala.concurrent.impl.Promise$DefaultPromise.onComplete(Promise.scala:270)
>>>>>>>>      at akka.actor.ActorSelection.resolveOne(ActorSelection.scala:63)
>>>>>>>>      at org.apache.flink.runtime.akka.AkkaUtils$.getReference(AkkaUtils.scala:321)
>>>>>>>>      at org.apache.flink.runtime.jobmanager.JobManager$.getJobManagerRemoteReference(JobManager.scala:952)
>>>>>>>>      ... 20 more
>>>>>>>> 
>>>>>>>> The exception above occurred while trying to run your command.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Client log doesn’t seem to show any info,
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 21:06:01,521 WARN  org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>>>>>>> 21:06:01,935 INFO  org.apache.flink.api.java.ExecutionEnvironment - The job has 0 registered types and 0 default Kryo serializers
>>>>>>>> 21:06:02,857 INFO  akka.event.slf4j.Slf4jLogger - Slf4jLogger started
>>>>>>>> 21:06:02,909 INFO  Remoting - Starting remoting
>>>>>>>> 21:06:03,158 INFO  Remoting - Remoting started; listening on addresses :[akka.tcp://flink@127.0.0.1:49463]

