flink-issues mailing list archives

From GEOFBOT <...@git.apache.org>
Subject [GitHub] flink issue #3232: [FLINK-5183] [py] Support multiple jobs per plan file
Date Thu, 09 Feb 2017 17:00:59 GMT
Github user GEOFBOT commented on the issue:

    It may have worked with a smaller file, but heavier jobs seem to run into problems. When I ran a more computationally intensive and time-consuming job, the first job in the Python file ran successfully. The second job in the file was then submitted:
    02/09/2017 16:39:43	DataSink (CsvSink)(4/5) switched to FINISHED 
    02/09/2017 16:39:43	Job execution switched to status FINISHED.
    2017-02-09 16:40:26,470 INFO  org.apache.flink.yarn.YarnClusterClient - Waiting until all TaskManagers have connected
    Waiting until all TaskManagers have connected
    2017-02-09 16:40:26,476 INFO  org.apache.flink.yarn.YarnClusterClient - TaskManager status (5/5)
    TaskManager status (5/5)
    2017-02-09 16:40:26,476 INFO  org.apache.flink.yarn.YarnClusterClient - All TaskManagers are connected
    All TaskManagers are connected
    2017-02-09 16:40:26,480 INFO  org.apache.flink.yarn.YarnClusterClient - Submitting job with JobID: b226f5f18a78bc386bd1b1b6d30515ea. Waiting for job completion.
    Submitting job with JobID: b226f5f18a78bc386bd1b1b6d30515ea. Waiting for job completion.
    Connected to JobManager at Actor[akka.tcp://flink@<snip>.ec2.internal:35598/user/jobmanager#68430682]
    However, Flink does not receive or respond to this new job. Instead, the client terminates with a timeout error:
    Caused by: org.apache.flink.runtime.client.JobClientActorSubmissionTimeoutException: Job submission to the JobManager timed out. You may increase 'akka.client.timeout' in case the JobManager needs more time to configure and confirm the job submission.
    	at org.apache.flink.runtime.client.JobSubmissionClientActor.handleCustomMessage(JobSubmissionClientActor.java:119)
    	at org.apache.flink.runtime.client.JobClientActor.handleMessage(JobClientActor.java:239)
    	at org.apache.flink.runtime.akka.FlinkUntypedActor.handleLeaderSessionID(FlinkUntypedActor.java:88)
    	at org.apache.flink.runtime.akka.FlinkUntypedActor.onReceive(FlinkUntypedActor.java:68)
    	at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
    I tried setting `akka.client.timeout` to 20 minutes, but Flink is still not receiving
the second job. I suspect this may be an issue with this patch.
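
    For reference, a 20-minute client timeout of the kind described above would typically be set in `flink-conf.yaml` along these lines. This is only a sketch: the `20 min` duration string is an assumed spelling, not copied from the setup that produced the logs above.

    ```yaml
    # flink-conf.yaml (sketch; value is an assumption, not taken from the original report)
    # Raises the time the client waits for the JobManager to confirm a job submission.
    akka.client.timeout: 20 min
    ```

    Since the submission still times out with a limit this generous, the JobManager is more likely never receiving the job than being slow to confirm it.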
