flink-user mailing list archives

From Sourigna Phetsarath <gna.phetsar...@teamaol.com>
Subject Re: Flink 1.0.0 reading files from multiple directory with wildcards
Date Wed, 23 Mar 2016 13:47:49 GMT
Great!  I will, once I clear it with the legal team here.

On Wed, Mar 23, 2016 at 6:19 AM, Ufuk Celebi <uce@apache.org> wrote:

> Nice! Would you like to contribute this to Flink via a pull request? Some
> resources about the contribution process can be found here:
>
> http://flink.apache.org/contribute-code.html
> http://flink.apache.org/how-to-contribute.html
>
> On Wed, Mar 23, 2016 at 12:00 AM, Fabian Hueske <fhueske@gmail.com> wrote:
>
>> Hi Gna,
>>
>> thanks for sharing the good news and opening the JIRA!
>>
>> Cheers, Fabian
>>
>> 2016-03-22 23:30 GMT+01:00 Sourigna Phetsarath <
>> gna.phetsarath@teamaol.com>:
>>
>>> Ufuk & Fabian,
>>>
>>> FYI, I was able to extend FileInputFormat and override createInputSplits
>>> to handle multiple Paths. The result was reduced resource usage and
>>> better performance for the job.
>>>
>>> Also added this ticket: https://issues.apache.org/jira/browse/FLINK-3655
>>>
>>> -Gna
>>>
>>> On Mon, Mar 21, 2016 at 10:04 AM, Sourigna Phetsarath <
>>> gna.phetsarath@teamaol.com> wrote:
>>>
>>>> Fabian,
>>>>
>>>> I'll try extending InputFormat as you suggested and will create a JIRA
>>>> issue as well.
>>>>
>>>> I also have an AvroGenericRecordInput format class that I would like to
>>>> contribute once I have time to clean it up and get it into your code base.
>>>>
>>>> -Gna
>>>>
>>>> On Mon, Mar 21, 2016 at 6:35 AM, Fabian Hueske <fhueske@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> no, this is currently not supported. However, I agree this would be a
>>>>> very valuable addition to the FileInputFormat.
>>>>> Would you mind opening a JIRA issue with your suggestions?
>>>>>
>>>>> Until this is added to Flink, it can be implemented as a custom
>>>>> InputFormat based on FileInputFormat by overriding the createInputSplits()
>>>>> method.
>>>>>
>>>>> Best, Fabian
>>>>>
>>>>> 2016-03-21 0:11 GMT+01:00 Sourigna Phetsarath <
>>>>> gna.phetsarath@teamaol.com>:
>>>>>
>>>>>> All,
>>>>>>
>>>>>> Do any of the Flink Data Sources support comma separated directories
>>>>>> with wildcards?
>>>>>>
>>>>>> For example:
>>>>>>
>>>>>> env.readFile("/data/2016/01/01/*/*,/data/2016/01/02/*/*,/data/2016/01/03/*/*")
>>>>>>
>>>>>>
>>>>>> Thanks in advance for any help that you can provide.
>>>>>> --
>>>>>>
>>>>>>
>>>>>> *Gna Phetsarath* System Architect // AOL Platforms // Data Services
>>>>>> // Applied Research Chapter
>>>>>> 770 Broadway, 5th Floor, New York, NY 10003
>>>>>> o: 212.402.4871 // m: 917.373.7363
>>>>>> vvmr: 8890237 aim: sphetsarath20 t: @sourigna
>>>>>>
>>>>>> * <http://www.aolplatforms.com>*
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>
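
For anyone following Fabian's suggestion in the thread above (a custom
InputFormat based on FileInputFormat that overrides createInputSplits()),
the first step is turning one comma-separated string of wildcard patterns
into a concrete list of files. The sketch below shows only that step, using
plain java.nio so it runs without Flink on the classpath. The class name
MultiPathGlob and its methods are hypothetical; they are not part of Flink
and not the code attached to FLINK-3655. It also assumes the patterns are
relative to a known root directory, which is a simplification of the
absolute paths in the original question.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not a Flink class: expands a comma-separated list of
// glob patterns into the regular files that match under a root directory.
final class MultiPathGlob {

    // Splits the spec on commas, trims whitespace, and drops empty entries.
    static List<String> splitPatterns(String spec) {
        List<String> patterns = new ArrayList<>();
        for (String part : spec.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                patterns.add(trimmed);
            }
        }
        return patterns;
    }

    // True if the relative path matches at least one of the matchers.
    private static boolean matchesAny(List<PathMatcher> matchers, Path relative) {
        for (PathMatcher matcher : matchers) {
            if (matcher.matches(relative)) {
                return true;
            }
        }
        return false;
    }

    // Walks the tree under root and returns, in sorted order, every regular
    // file whose path relative to root matches any pattern in the spec.
    static List<Path> expand(Path root, String spec) throws IOException {
        List<PathMatcher> matchers = new ArrayList<>();
        for (String pattern : splitPatterns(spec)) {
            matchers.add(FileSystems.getDefault().getPathMatcher("glob:" + pattern));
        }
        try (Stream<Path> walk = Files.walk(root)) {
            return walk.filter(Files::isRegularFile)
                       .filter(file -> matchesAny(matchers, root.relativize(file)))
                       .sorted()
                       .collect(Collectors.toList());
        }
    }
}
```

A call such as MultiPathGlob.expand(Paths.get("/data"), "2016/01/01/*/*,2016/01/02/*/*")
would list the files under those two day directories. Inside a real
FileInputFormat subclass the same expansion would presumably go through
Flink's own FileSystem abstraction (so that HDFS and other schemes work),
with each resulting file turned into one or more input splits.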

