flink-user mailing list archives

From David Anderson <da...@data-artisans.com>
Subject Re: streaming predictions
Date Tue, 24 Jul 2018 11:29:10 GMT
One option (which I haven't tried myself) would be to somehow get the model
into PMML format, and then use https://github.com/FlinkML/flink-jpmml to
score the model. You could either use another machine learning framework to
train the model (i.e., a framework that directly supports PMML export), or
convert the Flink model into PMML. Since SVMs are fairly simple to
describe, that might not be terribly difficult.
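
For the simpler map/flatMap route suggested in the replies quoted below, here
is a minimal sketch of scoring with a linear SVM in plain Scala. It assumes the
trained weights are available as a local Array[Double], that the model uses
+1/-1 labels, and that the decision threshold is 0.0. SvmScorer and its methods
are hypothetical names for illustration, not part of Flink ML.

```scala
// Hypothetical helper (not part of Flink ML): scores one record with a linear SVM.
object SvmScorer {
  // Raw decision value: dot product of the weight vector and the feature vector.
  def decision(weights: Array[Double], features: Array[Double]): Double = {
    require(weights.length == features.length, "dimension mismatch")
    var sum = 0.0
    var i = 0
    while (i < weights.length) {
      sum += weights(i) * features(i)
      i += 1
    }
    sum
  }

  // Assumed labeling: +1.0 if the decision value exceeds the threshold, else -1.0.
  def predict(weights: Array[Double],
              features: Array[Double],
              threshold: Double = 0.0): Double =
    if (decision(weights, features) > threshold) 1.0 else -1.0

  // Converts one (Int, Int, Int) stream element into a feature array.
  def toFeatures(t: (Int, Int, Int)): Array[Double] =
    Array(t._1.toDouble, t._2.toDouble, t._3.toDouble)
}
```

In a streaming job this could be applied with something like
dataStream.map(t => (t, SvmScorer.predict(weights, SvmScorer.toFeatures(t)))),
where weights is a local array captured by the closure. As far as I can tell,
svm2.weightsOption.get is itself a DataSet, so the batch job would first have to
collect the weights or write them out before the streaming job can load them.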

On Mon, Jul 23, 2018 at 4:18 AM Xingcan Cui <xingcanc@gmail.com> wrote:

> Hi Cederic,
>
> If the model is a simple function, you can just load it and make
> predictions using a map/flatMap function on the DataStream.
>
> But I’m afraid the model trained by Flink-ML is a “batch job”,
> whose predict method takes a DataSet as the parameter and outputs another
> DataSet as the result. That means you cannot easily apply the model to
> streams, at least for now.
>
> There are two options to solve this. (1) Train the model using another
> framework to produce a simple function. (2) Restructure your model serving
> as a series of batch jobs.
>
> Hope that helps,
> Xingcan
>
> On Jul 22, 2018, at 8:56 PM, Hequn Cheng <chenghequn@gmail.com> wrote:
>
> Hi Cederic,
>
> I am not familiar with SVMs or machine learning, but I think we can work it
> out together.
> What problem did you run into when you tried to implement this function?
> From my point of view, we can rebuild the model in the flatMap function and
> use it to predict the input data. There is some flatMap documentation here [1].
>
> Best, Hequn
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/operators/#datastream-transformations
>
> On Sun, Jul 22, 2018 at 4:12 PM, Cederic Bosmans <bosmansc6@gmail.com>
> wrote:
>
>> Dear
>>
>> My name is Cederic Bosmans and I am a master's student at Ghent
>> University (Belgium).
>> I am currently working on my master's dissertation, which involves Apache
>> Flink.
>>
>> I want to make predictions in the streaming environment based on a model
>> trained in the batch environment.
>>
>> I trained my SVM-model this way:
>> val svm2 = SVM()
>> svm2.setSeed(1)
>> svm2.fit(trainLV)
>> val testVD = testLV.map(lv => (lv.vector, lv.label))
>> val evalSet = svm2.evaluate(testVD)
>>
>> and saved the model:
>> val modelSvm = svm2.weightsOption.get
>>
>> Then I have an incoming datastream in the streaming environment:
>> DataStream[(Int, Int, Int)]
>> whose elements should be classified (binary) using this trained SVM model.
>>
>> Since the predict function only supports DataSet and not DataStream,
>> a Flink contributor mentioned on Stack Overflow that this should be done
>> using a map/flatMap function.
>> Unfortunately, I have not been able to work out how to write this function.
>>
>> It would be incredible for me if you could help me a little bit further!
>>
>> Kind regards and thanks in advance
>> Cederic Bosmans
>>
>
>
>

-- 
*David Anderson* | Training Coordinator | data Artisans
