flink-user mailing list archives

From Stefan Richter <s.rich...@data-artisans.com>
Subject Re: Recursive directory traversal with TextInputFormat
Date Wed, 07 Dec 2016 16:44:49 GMT

I think there is a field in FileInputFormat (which TextInputFormat extends) that could
serve your purpose if you override its default value:

/**
 * The flag to specify whether recursive traversal of the input directory
 * structure is enabled.
 */
protected boolean enumerateNestedFiles = false;
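
A minimal sketch of two ways to switch the flag on, assuming Flink 1.1.x is on the classpath. The class names RecursiveRead and RecursiveTextInputFormat are made up for illustration, and this has not been run against the attached SSCCE:

```java
import org.apache.flink.api.java.io.TextInputFormat;
import org.apache.flink.core.fs.Path;

public class RecursiveRead {

    /** Option A: subclass TextInputFormat and override the protected default. */
    public static class RecursiveTextInputFormat extends TextInputFormat {
        private static final long serialVersionUID = 1L;

        public RecursiveTextInputFormat(Path path) {
            super(path);
            // Field inherited from FileInputFormat; false unless overridden.
            this.enumerateNestedFiles = true;
        }
    }

    /** Option B: keep the stock TextInputFormat and use the public setter. */
    public static TextInputFormat recursiveFormat(String dir) {
        TextInputFormat format = new TextInputFormat(new Path(dir));
        format.setNestedFileEnumeration(true);
        return format;
    }
}
```

Either format can then be handed to the environment's readFile(format, dir) call in place of readTextFile(dir).
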
As for compression, I think this class also provides an InflaterInputStreamFactory to read
compressed data.
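
One caveat: I am not sure .zip is among the extensions those factories cover, since a .zip is an archive of entries rather than a single compressed stream. If it turns out not to be, a fallback in plain Java could look like the sketch below. It uses only java.util.zip from the JDK, not any Flink API, and the class name ZipCsvLines is made up for illustration:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class ZipCsvLines {

    /** Collects the lines of every .csv entry in a zip archive, in archive order. */
    public static List<String> readCsvLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in, StandardCharsets.UTF_8)) {
            ZipEntry entry;
            // getNextEntry() closes the previous entry and positions the
            // stream at the start of the next one.
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory() || !entry.getName().endsWith(".csv")) {
                    continue;
                }
                // Reads hit end-of-stream at the end of the current entry, so a
                // fresh reader per entry never runs into the next entry. The
                // reader is not closed here because that would close the zip.
                BufferedReader reader = new BufferedReader(
                        new InputStreamReader(zip, StandardCharsets.UTF_8));
                String line;
                while ((line = reader.readLine()) != null) {
                    lines.add(line);
                }
            }
        }
        return lines;
    }
}
```

Logic like this could live in the custom SourceFunction you already have, or the files could be repacked as .gz so each one is a single compressed stream.
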


> On 07.12.2016 at 12:10, Lukas Kircher <lukas.kircher@uni-konstanz.de> wrote:
> Hi all,
> I am trying to read nested .csv files from a directory and want to switch from a custom
> SourceFunction I implemented to the TextInputFormat. I have two questions:
> 1) Somehow only the file in the root directory is processed, nested files are skipped.
> What am I missing? See the attachment for an SSCCE. I get the same result with flink 1.1.3
> no matter if I run it via the IDE or submit the job to the standalone binary. The file input
> splits are all there, yet they don't seem to be processed.
> 2) What is the easiest way to read compressed .csv files (.zip)?
> Thanks for your help, cheers
> Lukas
> <ReadDirectorySSCCE.java>