flink-dev mailing list archives

From Gábor Hermann <m...@gaborhermann.com>
Subject Re: Machine Learning on Flink - Next steps
Date Fri, 10 Mar 2017 09:52:59 GMT
Hey all,

Sorry for the somewhat late response.

I'd like to work on
- Offline learning with Streaming API
- Low-latency prediction serving

I would drop the batch API ML because of past experience with the lack of 
support, and online learning because of the lack of use-cases.

I completely agree with Kate that offline learning should be supported, 
but given Flink's resources I prefer using the streaming API as Roberto 
suggested. Also, full model lifecycle (or end-to-end ML) could be more 
easily supported in one system (one API). Connecting Flink Batch with 
Flink Streaming is currently cumbersome (although side inputs [1] might 
help). In my opinion, a crucial part of end-to-end ML is low-latency 
predictions.
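
To make the serving idea concrete, here is a rough sketch in Scala (the 
record types and all names are made up, and a real implementation would of 
course need checkpointed state): model updates are broadcast and connected 
to the event stream, so predictions come from a plain streaming operator 
and a new model can be swapped in without a job restart.

import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.Collector

case class Event(features: Array[Double])
case class Model(weights: Array[Double])

// Scores each incoming event against the most recently received model.
class ServingFunction extends CoFlatMapFunction[Event, Model, Double] {
  private var model: Option[Model] = None

  override def flatMap1(event: Event, out: Collector[Double]): Unit =
    model.foreach { m =>
      out.collect(m.weights.zip(event.features).map { case (w, x) => w * x }.sum)
    }

  override def flatMap2(update: Model, out: Collector[Double]): Unit =
    model = Some(update) // swap in a freshly trained model on the fly
}

object ServingSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val events: DataStream[Event] = env.fromElements(Event(Array(1.0, 2.0)))
    val models: DataStream[Model] = env.fromElements(Model(Array(0.5, 0.5)))
    // Broadcast model updates so every parallel serving instance sees them.
    events.connect(models.broadcast).flatMap(new ServingFunction).print()
    env.execute("serving-sketch")
  }
}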

As another direction, we could integrate the Flink Streaming API with other 
projects (such as PredictionIO). However, I believe it's better to 
first evaluate the capabilities and drawbacks of the streaming API with 
some prototype that uses Flink Streaming for an ML task. Otherwise we 
could run into critical issues, just as the SystemML integration did with 
e.g. caching. Such issues make integrating the Batch API with other 
ML projects practically infeasible.

I've already been experimenting with offline learning on the Streaming 
API. Hopefully I can share some initial performance results on matrix 
factorization next week. Naturally, I've run into issues. E.g. I could 
only mark the end of the input with some hacks (see the sketch below), 
because this is not needed in a streaming job that consumes input forever. 
AFAIK, this would be resolved by side inputs [1].
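
To show what I mean by "hacks", here is a minimal sketch (the Rating type 
and all names are only illustrative): a bounded source emits an explicit 
end-of-input marker as a regular record, and downstream operators can 
react to it, e.g. by emitting the final factors.

import org.apache.flink.streaming.api.functions.source.SourceFunction

sealed trait Input
case class Rating(user: Int, item: Int, value: Double) extends Input
case object EndOfInput extends Input

// A bounded source that appends a sentinel record, so downstream operators
// can tell when the finite training data is exhausted.
class BoundedRatingSource(ratings: Seq[Rating]) extends SourceFunction[Input] {
  @volatile private var running = true

  override def run(ctx: SourceFunction.SourceContext[Input]): Unit = {
    val it = ratings.iterator
    while (running && it.hasNext) ctx.collect(it.next())
    if (running) ctx.collect(EndOfInput) // the hack: signal end of input in-band
  }

  override def cancel(): Unit = running = false
}

// usage: env.addSource(new BoundedRatingSource(data))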

@Theodore:
+1 for doing the prototype project(s) separately from the main Flink 
repository, although I would strongly suggest following the Flink 
development guidelines as closely as possible. As another note, there is 
already a GitHub organization for Flink-related projects [2], but it 
seems it has not been used much.

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-17+Side+Inputs+for+DataStream+API
[2] https://github.com/project-flink

On 2017-03-04 08:44, Roberto Bentivoglio wrote:

> Hi All,
>
> I'd like to start working on:
>   - Offline learning with Streaming API
>   - Online learning
>
> I also think that using a new organisation on GitHub, as Theodore
> proposed, to keep some initial independence and speed up the prototyping
> and development phases, is really interesting.
>
> I totally agree with Katherin that we need offline learning, but my
> opinion is that it will be more straightforward to fix the streaming
> issues than the batch issues, because we will have more support on that
> from the Flink community.
>
> Thanks and have a nice weekend,
> Roberto
>
> On 3 March 2017 at 20:20, amir bahmanyari <amirtousa@yahoo.com.invalid>
> wrote:
>
>> Great points to start:
>>    - Online learning
>>    - Offline learning with the streaming API
>>
>> Thanks + have a great weekend.
>>
>>        From: Katherin Eri <katherinmail@gmail.com>
>>   To: dev@flink.apache.org
>>   Sent: Friday, March 3, 2017 7:41 AM
>>   Subject: Re: Machine Learning on Flink - Next steps
>>
>> Thank you, Theodore.
>>
>> In short, I vote for:
>> 1) Online learning
>> 2) Low-latency prediction serving -> Offline learning with the batch API
>>
>> In detail:
>> 1) If streaming is the strong side of Flink, let's use it and try to
>> support some online learning or lightweight in-memory learning
>> algorithms, and try to build a pipeline for them.
>>
>> 2) I think that Flink should be part of the production ecosystem, and
>> if production systems now require ML support, multiple-model deployment
>> and so on, we should serve that. But in my opinion we shouldn't compete
>> with projects like PredictionIO; rather we should serve them and act as
>> an execution core. But that means a lot:
>>
>> a. Offline training should be supported, because most ML algorithms
>> are designed for offline training.
>> b. The model lifecycle should be supported:
>> ETL + transformation + training + scoring + quality monitoring in
>> production
>>
>> I understand that the batch world is full of competitors, but for me
>> that doesn't mean that batch should be ignored. I think that separate
>> streaming/batch applications cause additional deployment and operational
>> overhead, which one typically tries to avoid. That means, in my opinion,
>> that we should draw the community's attention to this problem.
>>
>>
>> On Fri, 3 Mar 2017 at 15:34, Theodore Vasiloudis <
>> theodoros.vasiloudis@gmail.com>:
>>
>> Hello all,
>>
>> From our previous discussion, started by Stavros, we decided to start a
>> planning document [1]
>> to figure out possible next steps for ML on Flink.
>>
>> Our concerns were mainly ensuring active development while satisfying the
>> needs of
>> the community.
>>
>> We have listed a number of proposals for future work in the document. In
>> short they are:
>>
>>    - Offline learning with the batch API
>>    - Online learning
>>    - Offline learning with the streaming API
>>    - Low-latency prediction serving
>>
>> I saw that a number of people are willing to work on ML for Flink, but
>> the truth is that we cannot cover all of these suggestions without
>> fragmenting the development too much.
>>
>> So my recommendation is to pick out 2 of these options, create design
>> documents and build prototypes for each library.
>> We can then assess their viability and together with the community decide
>> if we should try
>> to include one (or both) of them in the main Flink distribution.
>>
>> So I invite people to express their opinion about which task they
>> would be willing to contribute to, and hopefully we can settle on two
>> of these options.
>>
>> Once that is done we can decide how we do the actual work. Since this is
>> highly experimental
>> I would suggest we work on repositories where we have complete control.
>>
>> For that purpose I have created an organization [2] on Github which we can
>> use to create repositories and teams that work on them in an organized
>> manner.
>> Once enough work has accumulated we can start discussing contributing the
>> code
>> to the main distribution.
>>
>> Regards,
>> Theodore
>>
>> [1]
>> https://docs.google.com/document/d/1afQbvZBTV15qF3vobVWUjxQc49h3Ud06MIRhahtJ6dw/
>> [2] https://github.com/flinkml
>>
>> --
>>
>> Yours faithfully,
>>
>> Kate Eri.
>>
>>
>>
>
>

