flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-3964) Job submission times out with recursive.file.enumeration
Date Mon, 27 Jun 2016 13:15:52 GMT

    [ https://issues.apache.org/jira/browse/FLINK-3964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15350969#comment-15350969
] 

ASF GitHub Bot commented on FLINK-3964:
---------------------------------------

Github user gyfora commented on the issue:

    https://github.com/apache/flink/pull/2168
  
    I think it is often the case that the timeout is caused by some underlying failure. For instance, the savepoint restore fails for some reason, and then you get this timeout on the command line.
    
    Maybe it would also be good to add a note to check the JM log for any failures, because in that case increasing the timeout won't help.
    
    Just a thought :)


> Job submission times out with recursive.file.enumeration
> --------------------------------------------------------
>
>                 Key: FLINK-3964
>                 URL: https://issues.apache.org/jira/browse/FLINK-3964
>             Project: Flink
>          Issue Type: Bug
>          Components: Batch Connectors and Input/Output Formats, DataSet API
>    Affects Versions: 1.0.0
>            Reporter: Juho Autio
>
> When using {{recursive.file.enumeration}} with a big enough folder structure to list, the Flink batch job fails right at the beginning because of a timeout.
> h2. Problem details
> We get this error: {{Communication with JobManager failed: Job submission to the JobManager timed out}}.
> The code we have is basically this:
> {code}
> import org.apache.flink.api.scala._
> import org.apache.flink.api.java.utils.ParameterTool
> import org.apache.flink.configuration.Configuration
> import org.apache.hadoop.io.Text
>
> val env = ExecutionEnvironment.getExecutionEnvironment
>
> // set the recursive enumeration parameter
> val parameters = new Configuration
> parameters.setBoolean("recursive.file.enumeration", true)
>
> val parameter = ParameterTool.fromArgs(args)
> val input_data_path: String = parameter.get("input_data_path", null)
>
> val data: DataSet[(Text, Text)] = env
>   .readSequenceFile(classOf[Text], classOf[Text], input_data_path)
>   .withParameters(parameters)
>
> data.first(10).print()
> {code}
> If we set the {{input_data_path}} parameter to {{s3n://bucket/path/date=*/}}, it times out. If we use a more restrictive pattern like {{s3n://bucket/path/date=20160523/}}, it doesn't time out.
> To me it seems that the time taken to list files shouldn't cause any timeouts at the job submission level.
> For us this was "fixed" by adding {{akka.client.timeout: 600 s}} in {{flink-conf.yaml}}, but I wonder if the timeout would still occur if we had even more files to list?
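> For reference, the workaround corresponds to a {{flink-conf.yaml}} entry like this (the 600 s value is simply what worked for us, not a tuned recommendation):
> {code}
> # Client-side Akka ask timeout for job submission; raise it so that
> # slow recursive file enumeration does not abort the submit call.
> akka.client.timeout: 600 s
> {code}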
> ----
> P.S. Is there any way to set {{akka.client.timeout}} when calling {{bin/flink run}}, instead of editing {{flink-conf.yaml}}? I tried to add it as a {{-yD}} flag but couldn't get it working.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
