flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gwenhael Pasquiers <gwenhael.pasqui...@ericsson.com>
Subject RE: Great number of jobs and numberOfBuffers
Date Thu, 31 Aug 2017 08:40:56 GMT

Well yes, I could probably make it work with a constant number of operators (and consequently
buffers) by developing specific input and output classes, and that way I'd have a workaround
for that buffers issue.

The size of my job is input-dependent mostly because my code creates one full processing chain
per input folder (there is one folder per hour of data) and that processing chain has many
functions. The number of input folders is an argument to the software and can vary from 1
to hundreds (when we re-compute something like a month of data).

for(String folder:folders){
	env = Environment.getExecutionEnvironment();
	env.readText(folder).[......].writeText(folder + "_processed");

There is one full processing chain per folder (a loop is creating them) because the name of
the output is the same as the name of the input, and I did not want to create a specific "rolling-output"
with bucketers.

So, yes, I could develop a source that supports a list of folders and puts the source name
into the produced data. My processing could also handle tuples where one field is the source
folder and use it as a key where appropriate, and finally yes I could also create a sink that
will "dispatch" the datasets to the correct folder according to that field of the tuple.

But at first that seemed too complicated (premature optimization) so I coded my job the "naïve"
way then splitted it because I thought that there would be some sort of recycling of those
buffers between jobs.

If the recycling works in normal conditions maybe the question is whether it also works when
multiple jobs are ran from within the same jar VS running the jar multiple times ?

PS: I don't have log files at hand, the app is ran on a separate and secured platform. Sure,
I could try to reproduce the issue with a mock app but I'm not sure it would help the discussion.
View raw message