flink-user mailing list archives

From Sourigna Phetsarath <gna.phetsar...@teamaol.com>
Subject Re: Flink 1.0.0 reading files from multiple directory with wildcards
Date Tue, 22 Mar 2016 22:30:13 GMT
Ufuk & Fabian,

FYI, I was able to extend FileInputFormat and override createInputSplits
to handle multiple Paths. This reduced the job's resource usage and
improved its performance.
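
For readers of the archive, here is a minimal sketch of the first step such an
input format needs: turning the comma separated string from the original
question into individual wildcard patterns and checking candidate files
against them. It uses only the JDK (java.nio.file glob matching), not Flink.
The class name MultiPathSpec and both method names are hypothetical, and the
sample file names are made up for illustration. Turning the matched paths into
input splits inside an overridden createInputSplits() is not shown.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; this is not part of Flink.
final class MultiPathSpec {

    // Split "a,b, c" into trimmed, non-empty wildcard patterns.
    static List<String> splitSpec(String spec) {
        List<String> patterns = new ArrayList<>();
        for (String part : spec.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                patterns.add(trimmed);
            }
        }
        return patterns;
    }

    // Return true if the path matches at least one of the glob patterns.
    // In JDK glob syntax, '*' does not cross directory boundaries.
    static boolean matchesAny(List<String> patterns, String path) {
        Path candidate = Paths.get(path);
        for (String pattern : patterns) {
            PathMatcher matcher =
                    FileSystems.getDefault().getPathMatcher("glob:" + pattern);
            if (matcher.matches(candidate)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> patterns = splitSpec(
                "/data/2016/01/01/*/*,/data/2016/01/02/*/*, /data/2016/01/03/*/*");
        // Made-up file names, used only to exercise the matcher.
        System.out.println(matchesAny(patterns, "/data/2016/01/02/05/part-0.avro"));
        System.out.println(matchesAny(patterns, "/data/2016/01/04/05/part-0.avro"));
    }
}
```

Note that splitting on every comma is a simplification: a pattern that itself
contains a comma, such as a {a,b} glob group, would be broken apart by this
sketch.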

Also added this ticket: https://issues.apache.org/jira/browse/FLINK-3655

-Gna

On Mon, Mar 21, 2016 at 10:04 AM, Sourigna Phetsarath <gna.phetsarath@teamaol.com> wrote:

> Fabian,
>
> I'll try extending InputFormat as you suggested and will create a JIRA
> issue as well.
>
> I also have an AvroGenericRecordInput format class that I would like to
> contribute once I have time to clean it up and get it into your code base.
>
> -Gna
>
> On Mon, Mar 21, 2016 at 6:35 AM, Fabian Hueske <fhueske@gmail.com> wrote:
>
>> Hi,
>>
>> no, this is currently not supported. However, I agree this would be a
>> very valuable addition to the FileInputFormat.
>> Would you mind opening a JIRA issue with your suggestions?
>>
>> Until this is added to Flink, it can be implemented as a custom
>> InputFormat based on FileInputFormat by overriding the createInputSplits()
>> method.
>>
>> Best, Fabian
>>
>> 2016-03-21 0:11 GMT+01:00 Sourigna Phetsarath <gna.phetsarath@teamaol.com>:
>>
>>> All,
>>>
>>> Do any of the Flink Data Sources support comma separated directories
>>> with wildcards?
>>>
>>> For example:
>>>
>>> env.readFile("/data/2016/01/01/*/*,/data/2016/01/02/*/*,
>>> /data/2016/01/03/*/*")
>>>
>>>
>>> Thanks in advance for any help that you can provide.
>>> --
>>>
>>>
>>> *Gna Phetsarath* System Architect // AOL Platforms // Data Services //
>>> Applied Research Chapter
>>> 770 Broadway, 5th Floor, New York, NY 10003
>>> o: 212.402.4871 // m: 917.373.7363
>>> vvmr: 8890237 aim: sphetsarath20 t: @sourigna
>>>
>>> * <http://www.aolplatforms.com>*
>>>
>>
>>
>
>



