flink-user mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Re: Flink 1.0.0 reading files from multiple directory with wildcards
Date Tue, 22 Mar 2016 23:00:22 GMT
Hi Gna,

thanks for sharing the good news and opening the JIRA!

Cheers, Fabian

2016-03-22 23:30 GMT+01:00 Sourigna Phetsarath <gna.phetsarath@teamaol.com>:

> Ufuk & Fabian,
>
> FYI, I was able to extend FileInputFormat and override createInputSplits()
> to handle multiple Paths. The job now uses fewer resources and runs
> faster.
>
> Also added this ticket: https://issues.apache.org/jira/browse/FLINK-3655
>
> -Gna
>
> On Mon, Mar 21, 2016 at 10:04 AM, Sourigna Phetsarath <
> gna.phetsarath@teamaol.com> wrote:
>
>> Fabian,
>>
>> I'll try extending InputFormat as you suggested and will create a JIRA
>> issue as well.
>>
>> I also have an AvroGenericRecordInput format class that I would like to
>> contribute once I have time to clean it up and get it into your code base.
>>
>> -Gna
>>
>> On Mon, Mar 21, 2016 at 6:35 AM, Fabian Hueske <fhueske@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> no, this is currently not supported. However, I agree this would be a
>>> very valuable addition to the FileInputFormat.
>>> Would you mind opening a JIRA issue with your suggestions?
>>>
>>> Until this is added to Flink, it can be implemented as a custom
>>> InputFormat based on FileInputFormat by overriding the createInputSplits()
>>> method.
>>>
>>> Best, Fabian
>>>
>>> 2016-03-21 0:11 GMT+01:00 Sourigna Phetsarath <
>>> gna.phetsarath@teamaol.com>:
>>>
>>>> All,
>>>>
>>>> Do any of the Flink Data Sources support comma separated directories
>>>> with wildcards?
>>>>
>>>> For example:
>>>>
>>>> env.readFile("/data/2016/01/01/*/*,/data/2016/01/02/*/*,/data/2016/01/03/*/*")
>>>>
>>>>
>>>> Thanks in advance for any help that you can provide.
>>>>
>>>
>>>
>>
>>
>>
>
>
>
> --
>
>
> *Gna Phetsarath* System Architect // AOL Platforms // Data Services //
> Applied Research Chapter
> 770 Broadway, 5th Floor, New York, NY 10003
> o: 212.402.4871 // m: 917.373.7363
> vvmr: 8890237 aim: sphetsarath20 t: @sourigna
>
> * <http://www.aolplatforms.com>*
>
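
For readers who find this thread later: the thread does not include the code for the workaround, so the following is only a rough sketch of the path-handling part. A createInputSplits() override along the lines Fabian suggests would first need to split the comma-separated string and expand the wildcards. The sketch uses only java.nio.file and no Flink classes. The class name MultiPathGlob and its three methods are hypothetical, and it assumes Unix-style paths on the local file system and that only regular files should match. It is not the implementation attached to FLINK-3655.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper, not part of Flink.
final class MultiPathGlob {

    /** Splits a comma-separated spec into trimmed, non-empty patterns. */
    static List<String> splitPatterns(String spec) {
        List<String> patterns = new ArrayList<>();
        for (String part : spec.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                patterns.add(trimmed);
            }
        }
        return patterns;
    }

    /** Returns the leading part of the pattern that contains no wildcard. */
    static Path fixedPrefix(String pattern) {
        Path prefix = Paths.get(pattern.startsWith("/") ? "/" : "");
        for (String segment : pattern.split("/")) {
            if (segment.isEmpty()) {
                continue;
            }
            if (segment.matches(".*[*?\\[{].*")) {
                break; // first segment with a glob character ends the prefix
            }
            prefix = prefix.resolve(segment);
        }
        return prefix;
    }

    /** Expands every pattern in the spec to the regular files it matches. */
    static List<Path> expand(String spec) {
        List<Path> result = new ArrayList<>();
        for (String pattern : splitPatterns(spec)) {
            Path root = fixedPrefix(pattern);
            if (!Files.exists(root)) {
                continue; // nothing to match under a missing directory
            }
            // In java.nio glob syntax, '*' does not cross a '/' boundary.
            PathMatcher matcher =
                    FileSystems.getDefault().getPathMatcher("glob:" + pattern);
            try (Stream<Path> walk = Files.walk(root)) {
                walk.filter(Files::isRegularFile)
                    .filter(matcher::matches)
                    .sorted()
                    .forEach(result::add);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String spec = args.length > 0 ? args[0]
                : "/data/2016/01/01/*/*,/data/2016/01/02/*/*,/data/2016/01/03/*/*";
        for (Path p : expand(spec)) {
            System.out.println(p);
        }
    }
}
```

In a custom format, the paths returned by expand() would be the input for building the file splits. For HDFS or other non-local file systems the listing would have to go through Flink's own file system classes instead of java.nio.file.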
