flink-user mailing list archives

From Stephan Ewen <se...@apache.org>
Subject Re: Input from nested directory structure
Date Tue, 02 Dec 2014 15:52:09 GMT
Hi!

Not right now. The input formats do not enumerate files recursively. In
that respect, we followed Hadoop's behavior.

If there is interest in this, it should not be too hard to add an option
to the FileInputFormat that does a complete recursive traversal of the
directory structure.
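
Until such an option exists, one possible workaround is to enumerate the
nested files yourself and read each one separately. The following is only a
minimal sketch using the plain Java file API, not part of Flink: the class
name NestedInputFiles and the method listFilesRecursively are hypothetical,
and the default path "logs_dir" is taken from the example in the question.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not a Flink class.
public class NestedInputFiles {

    // Walks the whole directory tree under root and returns every
    // regular file (no directories), sorted by path.
    public static List<Path> listFilesRecursively(Path root) {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path root = Paths.get(args.length > 0 ? args[0] : "logs_dir");
        if (!Files.isDirectory(root)) {
            System.out.println("Not a directory: " + root);
            return;
        }
        for (Path file : listFilesRecursively(root)) {
            // Assumption: each path printed here would be passed to
            // readTextFile(file.toString()) and the resulting data sets
            // combined in the job.
            System.out.println(file);
        }
    }
}
```

Each returned path could then be given to readTextFile() on its own and the
results combined, for example with union, assuming the number of files is
small enough for that to be practical.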

Greetings,
Stephan


On Tue, Dec 2, 2014 at 4:32 PM, Vasiliki Kalavri <vasilikikalavri@gmail.com>
wrote:

> Hello all,
>
> I want to run a Flink log processing job and my input is stored locally in
> a nested directory structure, like the following:
>
> logs_dir/
> |-----/machine1/
> |-----------/january.log
> |-----------/february.log
> ...
> |-----/machine2/
> ...
>
> etc.
>
> When providing "logs_dir" as the argument to readTextFile(), nothing is
> read and no exception or error is returned.
> Copying the nested individual files machine1/january.log,
> machine1/february.log, ..., to the same directory works fine, but I was
> wondering whether there is a better way to do this?
>
> Thank you!
> V.
>
