flink-user mailing list archives

From Juan Rodríguez Hortalá <juan.rodriguez.hort...@gmail.com>
Subject Re: Hardware requirements and learning resources
Date Wed, 02 Sep 2015 18:07:13 GMT
Hi Kostas,

Thanks a lot for your answer. It's nice to know there are more training
videos on their way; they will be on my watch list. I guess you'll be using
the data Artisans channel for the new videos too.



2015-09-02 14:30 GMT+02:00 Kostas Tzoumas <ktzoumas@apache.org>:

> Hi Juan,
> Flink is quite nimble with hardware requirements; people have run it on
> old-ish laptops and also on the largest instances available from cloud
> providers. I will let others chime in with more details.
> I am not aware of anything along the lines of the cheat sheet that you
> mention. If you actually try to do this, I would love to see it, and it
> might be useful to others as well. Spark and Flink use similar abstractions
> at the API level (i.e., parallel collections), so if you stay true to the
> functional paradigm and do not try to "abuse" the system by exploiting
> knowledge of its internals, things should be straightforward. This applies
> to the batch APIs; the streaming API in Flink follows a true streaming
> paradigm, where you get an unbounded stream of records and operators on
> these streams.
> Funny that you ask about a video for the DataStream slides. There is a
> Flink training happening as we speak, and a video is being recorded right
> now :-) Hopefully it will be made available soon.
> Best,
> Kostas
> On Wed, Sep 2, 2015 at 1:13 PM, Juan Rodríguez Hortalá <
> juan.rodriguez.hortala@gmail.com> wrote:
>> Answering my own question: I have found some nice training material at
>> http://dataartisans.github.io/flink-training. There are even videos on
>> YouTube for some of the slides:
>>   - http://dataartisans.github.io/flink-training/overview/intro.html
>>     https://www.youtube.com/watch?v=XgC6c4Wiqvs
>>   - http://dataartisans.github.io/flink-training/dataSetBasics/intro.html
>>     https://www.youtube.com/watch?v=0EARqW15dDk
>> The third lecture
>> http://dataartisans.github.io/flink-training/dataSetAdvanced/intro.html
>> corresponds roughly, but not exactly, to
>> https://www.youtube.com/watch?v=1yWKZ26NQeU
>> There are more lessons at http://dataartisans.github.io/flink-training
>> on stream processing and the Table API, for which I haven't found a
>> video. Does anyone have pointers to the missing videos?
>> Greetings,
>> Juan
>> 2015-09-02 12:50 GMT+02:00 Juan Rodríguez Hortalá <
>> juan.rodriguez.hortala@gmail.com>:
>>> Hi list,
>>> I'm new to Flink, and I find this project very interesting. I have
>>> experience with Apache Spark, and from what I've seen so far, Flink
>>> provides an API at a similar abstraction level but based on single-record
>>> processing instead of batch processing. I've read on Quora that Flink
>>> extends stream processing to batch processing, while Spark extends batch
>>> processing to streaming. Therefore I find Flink especially attractive for
>>> low-latency stream processing. Anyway, I would appreciate it if someone
>>> could indicate where I could find a list of hardware requirements for
>>> the slave nodes in a Flink cluster, something along the lines of
>>> https://spark.apache.org/docs/latest/hardware-provisioning.html.
>>> Spark is known for having quite high minimum hardware requirements (8 GB
>>> RAM and 8 cores), and I was wondering if that is also the case for Flink.
>>> Lower memory requirements would be very interesting for building small
>>> Flink clusters for educational purposes, or for small projects.
>>> Apart from that, I wonder if there is some blog post by the community
>>> about transitioning from Spark to Flink. I think it could be interesting,
>>> as there are some similarities in the APIs, but also deep differences in
>>> the underlying approaches. I was thinking of something like Breeze's
>>> cheat sheet comparing its matrix operations with those available in MATLAB
>>> and NumPy
>>> https://github.com/scalanlp/breeze/wiki/Linear-Algebra-Cheat-Sheet, or
>>> like http://rosettacode.org/wiki/Factorial. Just an idea anyway. Also,
>>> any pointers to online courses, books, or training for Flink besides the
>>> official programming guides would be much appreciated.
>>> Thanks in advance for your help.
>>> Greetings,
>>> Juan
