flink-user mailing list archives

From Averell <lvhu...@gmail.com>
Subject Re: FileInputFormat that processes files in chronological order
Date Sun, 28 Apr 2019 06:39:27 GMT
Hi,

Regarding splitting by shards, I believe that you can simply create two
sources, one for each shard. After that, union them together.

Regarding processing files in chronological order, Flink currently reads
files in order of their last-modified time (i.e. the oldest files are
processed first). So if file1.json is older than file2, and file2 is older
than file3, you don't need to do anything.
If your file times are not in that order, then I think it's more complex. But
first I am curious why there is such a requirement. Is this a
streaming problem?
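
If you want to check what "oldest first" would mean for your own files, here
is a minimal sketch in plain Java (no Flink involved) that lists a directory
sorted by last-modified time. The class name OldestFirst and the helper
oldestFirst are made up for illustration; this is not Flink's internal code,
only a way to compare modification-time order against the order you expect.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper, not part of Flink: shows modification-time order only.
public class OldestFirst {

    // Returns the regular files directly inside dir, oldest modification time first.
    // Files with identical timestamps may come back in any relative order.
    static List<Path> oldestFirst(Path dir) throws IOException {
        List<Path> files = new ArrayList<>();
        try (Stream<Path> entries = Files.list(dir)) {
            entries.filter(Files::isRegularFile).forEach(files::add);
        }
        files.sort(Comparator.comparingLong((Path p) -> p.toFile().lastModified()));
        return files;
    }

    public static void main(String[] args) throws IOException {
        // Directory to inspect: first argument, or the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : oldestFirst(dir)) {
            System.out.println(p.toFile().lastModified() + "  " + p.getFileName());
        }
    }
}
```

If the printed order differs from the order you need, then you are in the
"more complex" case above.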

I don't think FileInputFormat is relevant here. You would use it when your
files are in a format not currently supported by Flink.

Regards,
Averell  



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
