flink-dev mailing list archives

From Mustafa Elbehery <elbeherymust...@gmail.com>
Subject Re: Tweets Custom Input Format
Date Fri, 27 Feb 2015 09:46:44 GMT
Actually, I am reading "How to contribute" now so I can push the code. It's
working and tested both locally and on the cluster, and I have used it for
an ETL job.

The structure is as follows:

Java POJOs for the tweet object and its nested objects, a parser class
using an event-driven approach, and the SimpleTweetInputFormat itself.
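
To illustrate the event-driven idea, here is a minimal sketch. It is not the
contributed code: the JsonEvents interface is a hypothetical stand-in for the
callbacks an event-driven JSON parser would fire, the POJOs cover only a few
assumed tweet fields (id, text, user.screen_name, user.followers_count), and
the events are fired by hand in demo() instead of coming from a real parser.
The point it shows is that the handler fills the POJOs directly as events
arrive, without building an intermediate JSON tree.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class TweetEventSketch {

    /** POJO for the nested "user" object (only two assumed fields). */
    public static class User {
        public String screenName;
        public long followersCount;
    }

    /** POJO for the tweet itself (only a few assumed fields). */
    public static class Tweet {
        public long id;
        public String text;
        public User user = new User();
    }

    /** Hypothetical callbacks, standing in for a real parser's handler interface. */
    public interface JsonEvents {
        void startObject();
        void endObject();
        void key(String name);
        void primitive(Object value);
    }

    /** Fills a Tweet as events arrive. */
    public static class TweetHandler implements JsonEvents {
        private static final String ROOT = "$";
        private final Tweet tweet = new Tweet();
        // Keys of the objects we are currently inside, innermost first.
        private final Deque<String> path = new ArrayDeque<>();
        private String currentKey;

        @Override
        public void startObject() {
            // The root object has no key; nested objects are tracked by their key.
            path.push(currentKey == null ? ROOT : currentKey);
            currentKey = null;
        }

        @Override
        public void endObject() {
            path.pop();
        }

        @Override
        public void key(String name) {
            currentKey = name;
        }

        @Override
        public void primitive(Object value) {
            if (currentKey == null || path.isEmpty()) {
                return; // values without a key (e.g. array elements) are ignored here
            }
            String parent = path.peek();
            if (ROOT.equals(parent)) {
                if (currentKey.equals("id")) {
                    tweet.id = ((Number) value).longValue();
                } else if (currentKey.equals("text")) {
                    tweet.text = (String) value;
                }
            } else if ("user".equals(parent)) {
                if (currentKey.equals("screen_name")) {
                    tweet.user.screenName = (String) value;
                } else if (currentKey.equals("followers_count")) {
                    tweet.user.followersCount = ((Number) value).longValue();
                }
            }
            currentKey = null;
        }

        public Tweet getTweet() {
            return tweet;
        }
    }

    /** Fires, by hand, the events a parser would emit for one small tweet. */
    public static Tweet demo() {
        TweetHandler h = new TweetHandler();
        h.startObject();
        h.key("id");              h.primitive(42L);
        h.key("text");            h.primitive("hello flink");
        h.key("user");            h.startObject();
        h.key("screen_name");     h.primitive("flink_user");
        h.key("followers_count"); h.primitive(7L);
        h.endObject();
        h.endObject();
        return h.getTweet();
    }

    public static void main(String[] args) {
        Tweet t = demo();
        System.out.println(t.id + " " + t.text + " @" + t.user.screenName);
    }
}
```

In the real setup, the input format would hand each record to the parser, and
the parser would drive a handler like this one; that wiring is left out here
because it depends on Flink and on the JSON library.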

Would you guide me on how to push the code, just to save some time? :)


On Fri, Feb 27, 2015 at 10:42 AM, Robert Metzger <rmetzger@apache.org>
wrote:

> Hi,
>
> cool! Can you generalize the input format to read JSON into an arbitrary
> POJO?
>
> It would be great if you could contribute the InputFormat into the
> "flink-contrib" module. I've seen many users reading JSON data with Flink,
> so it's good to have a standard solution for that.
> If you want you can add the "Tweet into POJO" as an example into
> flink-contrib.
>
> On Fri, Feb 27, 2015 at 10:37 AM, Mustafa Elbehery <
> elbeherymustafa@gmail.com> wrote:
>
> > Hi,
> >
> > I am really sorry for being so late; it was a whole month of projects and
> > examinations, and I was really busy.
> >
> > @Robert, it is an IF for reading tweets into POJOs. I use an event-driven
> > parser and retrieve most of the tweet into Java POJOs. It was tested on a
> > 1 TB dataset for a Flink ETL job, and the performance was pretty good.
> >
> >
> >
> > On Sun, Jan 25, 2015 at 7:38 PM, Robert Metzger <rmetzger@apache.org>
> > wrote:
> >
> > > Hey,
> > >
> > > is it an input format for reading JSON data or an IF for reading tweets
> > > in some format into a POJO?
> > >
> > > I think a JSON Input Format would be something very useful for our
> > > users. Maybe you can add that and use the Tweet IF as a concrete example
> > > for that?
> > > Do you have a preview of the code somewhere?
> > >
> > > Best,
> > > Robert
> > >
> > > On Sat, Jan 24, 2015 at 11:06 AM, Fabian Hueske <fhueske@gmail.com>
> > wrote:
> > >
> > > > Hi Mustafa,
> > > >
> > > > that would be a nice contribution!
> > > >
> > > > We are currently discussing how to add "non-core" API features into
> > > > Flink [1].
> > > > I will move this discussion onto the mailing list to decide where to
> > > > add cool add-ons like yours.
> > > >
> > > > Cheers, Fabian
> > > >
> > > > [1] https://issues.apache.org/jira/browse/FLINK-1398
> > > >
> > > > 2015-01-23 20:42 GMT+01:00 Henry Saputra <henry.saputra@gmail.com>:
> > > >
> > > > > Contributions are welcomed!
> > > > >
> > > > > Here is the link on how to contribute to Apache Flink:
> > > > > http://flink.apache.org/how-to-contribute.html
> > > > >
> > > > > You can start by creating a JIRA ticket [1] to describe what you
> > > > > want to do and to get feedback from the community.
> > > > >
> > > > >
> > > > > - Henry
> > > > >
> > > > > [1] https://issues.apache.org/jira/secure/Dashboard.jspa
> > > > >
> > > > > On Fri, Jan 23, 2015 at 10:54 AM, Mustafa Elbehery
> > > > > <elbeherymustafa@gmail.com> wrote:
> > > > > > Hi,
> > > > > >
> > > > > > I have created a custom InputFormat for tweets on Flink, based on
> > > > > > the JSON-Simple event-driven parser. I would like to contribute my
> > > > > > work into Flink.
> > > > > >
> > > > > > Regards.
> > > > > >
> > > > > > --
> > > > > > Mustafa Elbehery
> > > > > > EIT ICT Labs Master School <http://www.masterschool.eitictlabs.eu/home/>
> > > > > > +49(0)15218676094
> > > > > > skype: mustafaelbehery87
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > Mustafa Elbehery
> > EIT ICT Labs Master School <http://www.masterschool.eitictlabs.eu/home/>
> > +49(0)15218676094
> > skype: mustafaelbehery87
> >
>



-- 
Mustafa Elbehery
EIT ICT Labs Master School <http://www.masterschool.eitictlabs.eu/home/>
+49(0)15218676094
skype: mustafaelbehery87
