Mailing-List: contact user-help@flink.apache.org; run by ezmlm
Reply-To: user@flink.apache.org
MIME-Version: 1.0
Date: Mon, 21 Mar 2016 10:04:27 -0400
Subject: Re: Flink 1.0.0 reading files from multiple directory with wildcards
From: Sourigna Phetsarath
To: user@flink.apache.org
Content-Type: multipart/alternative; boundary=001a1140cd042351fb052e8f93f8

--001a1140cd042351fb052e8f93f8
Content-Type: text/plain; charset=UTF-8
Fabian,

I'll try extending InputFormat as you suggested and will create a JIRA
issue as well.

I also have an AvroGenericRecordInput format class that I would like to
contribute once I have time to clean it up and get it into your code base.

-Gna

On Mon, Mar 21, 2016 at 6:35 AM, Fabian Hueske wrote:

> Hi,
>
> no, this is currently not supported. However, I agree this would be a very
> valuable addition to the FileInputFormat.
> Would you mind opening a JIRA issue with your suggestions?
>
> Until this is added to Flink, it can be implemented as a custom
> InputFormat based on FileInputFormat by overriding the createInputSplits()
> method.
>
> Best, Fabian
>
> 2016-03-21 0:11 GMT+01:00 Sourigna Phetsarath:
>
>> All,
>>
>> Do any of the Flink Data Sources support comma separated directories with
>> wildcards?
>>
>> For example:
>>
>> env.readFile("/data/2016/01/01/*/*,/data/2016/01/02/*/*,/data/2016/01/03/*/*")
>>
>> Thanks in advance for any help that you can provide.
>> --
>>
>> *Gna Phetsarath*
>> System Architect // AOL Platforms // Data Services //
>> Applied Research Chapter
>> 770 Broadway, 5th Floor, New York, NY 10003
>> o: 212.402.4871 // m: 917.373.7363
>> vvmr: 8890237 aim: sphetsarath20 t: @sourigna

--
*Gna Phetsarath*
System Architect // AOL Platforms // Data Services // Applied Research Chapter
770 Broadway, 5th Floor, New York, NY 10003
o: 212.402.4871 // m: 917.373.7363
vvmr: 8890237 aim: sphetsarath20 t: @sourigna

--001a1140cd042351fb052e8f93f8
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Fabian,

I'll try extending InputFormat as you suggested and will create a JIRA issue as well.

I also have an AvroGenericRecordInput format class that I would like to contribute once I have time to clean it up and get it into your code base.

-Gna

On Mon, Mar 21, 2016 at 6:35 AM, Fabian Hueske <fhueske@gmail.com> wrote:
Hi,

no, this is currently not supported. However, I agree this would be a very valuable addition to the FileInputFormat.
Would you mind opening a JIRA issue with your suggestions?

Until this is added to Flink, it can be implemented as a custom InputFormat based on FileInputFormat by overriding the createInputSplits() method.

Best, Fabian

2016-03-21 0:11 GMT+01:00 Sourigna Phetsarath <gna.phetsarath@teamaol.com>:
All,

Do any of the Flink Data Sources support comma separated directories with wildcards?

For example:
env.readFile("/data/2016/01/01/*/*,/data/2016/01/02/*/*,/data/2016/01/03/*/*")

Thanks in advance for any help that you can provide.
--

Gna Phetsarath
System Architect // AOL Platforms // Data Services // Applied Research Chapter
770 Broadway, 5th Floor, New York, NY 10003
o: 212.402.4871 // m: 917.373.7363
vvmr: 8890237 aim: sphetsarath20 t: @sourigna
--001a1140cd042351fb052e8f93f8--
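
Note on the workaround discussed in this thread: Fabian's suggestion is a custom InputFormat based on FileInputFormat that overrides createInputSplits(). The step such an override would need beyond the stock behaviour is turning one comma-separated string of wildcard paths into a concrete list of files. Below is a minimal sketch of only that step, under stated assumptions: it uses plain java.nio against the local file system rather than Flink's file system classes, `MultiGlobPaths` and `expand` are hypothetical names that do not exist in Flink, paths use `/` separators, and each pattern is expected to match files rather than directories.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper (not part of Flink): expands "glob1,glob2,..." into files. */
final class MultiGlobPaths {

    /** Splits the comma-separated spec and returns every regular file matching any pattern, sorted. */
    static List<Path> expand(String spec) throws IOException {
        List<Path> matches = new ArrayList<Path>();
        for (String raw : spec.split(",")) {
            String pattern = raw.trim();
            if (!pattern.isEmpty()) {
                matches.addAll(expandOne(pattern));
            }
        }
        Collections.sort(matches);
        return matches;
    }

    /** Walks the wildcard-free directory prefix of one pattern and collects matching files. */
    private static List<Path> expandOne(String pattern) throws IOException {
        final List<Path> found = new ArrayList<Path>();
        Path root = wildcardFreePrefix(pattern);
        if (!Files.isDirectory(root)) {
            return found; // nothing to scan for this pattern
        }
        // With "glob:" syntax, '*' does not cross directory boundaries.
        final PathMatcher matcher =
                FileSystems.getDefault().getPathMatcher("glob:" + pattern);
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (attrs.isRegularFile() && matcher.matches(file)) {
                    found.add(file);
                }
                return FileVisitResult.CONTINUE;
            }
        });
        return found;
    }

    /** Returns the directory part of the pattern that precedes the first wildcard character. */
    private static Path wildcardFreePrefix(String pattern) {
        int firstWildcard = pattern.length();
        for (int i = 0; i < pattern.length(); i++) {
            if ("*?[{".indexOf(pattern.charAt(i)) >= 0) {
                firstWildcard = i;
                break;
            }
        }
        int lastSlash = pattern.lastIndexOf('/', firstWildcard);
        if (lastSlash < 0) {
            return Paths.get("");
        }
        return Paths.get(lastSlash == 0 ? "/" : pattern.substring(0, lastSlash));
    }
}
```

A custom format could call something like `MultiGlobPaths.expand(...)` from its createInputSplits() override and build splits from the returned files. For data on HDFS, the directory walk would have to go through Flink's or Hadoop's file system API instead of java.nio.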