flink-user mailing list archives

From Simone Robutti <simone.robu...@radicalbit.io>
Subject Re: Storing JPMML-Model Object as a Variable Closure?
Date Mon, 05 Sep 2016 12:59:02 GMT
I think you could make use of this small component I've developed:
https://gitlab.com/chobeat/Flink-JPMML

It's built specifically for using JPMML on Flink. It may contain more than
you need, but you could reuse the code of the Operator for your use case.
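
For the closure-size concern raised in the quoted question below, here is a minimal sketch in plain Java (no Flink or JPMML dependency) of why capturing the model in the function is expensive, and of one common way around it. All class names here (`FakeModel`, `ClosureScorer`, `LazyScorer`, `ClosureSizeSketch`) are hypothetical and not taken from Flink-JPMML. The `open()` method only mirrors the role that `open()` plays in Flink rich functions, and the `modelBytes` field stands in for something like a model path that each task would read from storage.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical demo class: measures how many bytes Java serialization
// needs to ship each variant of the scoring function.
final class ClosureSizeSketch {
    static int serializedSize(Serializable function) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(function);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        int modelBytes = 1 << 20; // 1 MiB stand-in for a much larger model
        int closure = serializedSize(new ClosureScorer(new FakeModel(modelBytes)));
        int lazy = serializedSize(new LazyScorer(modelBytes));
        System.out.println("closure variant: " + closure + " bytes");
        System.out.println("lazy variant:    " + lazy + " bytes");
    }
}

// Hypothetical stand-in for a large model object; not a JPMML class.
final class FakeModel implements Serializable {
    private static final long serialVersionUID = 1L;
    private final byte[] payload;

    FakeModel(int sizeBytes) {
        this.payload = new byte[sizeBytes];
    }

    int score(int input) {
        return input + payload.length;
    }
}

// Variant 1: the model is a regular field, so it is serialized
// together with the function every time the function is shipped.
final class ClosureScorer implements Serializable {
    private static final long serialVersionUID = 1L;
    private final FakeModel model;

    ClosureScorer(FakeModel model) {
        this.model = model;
    }

    int apply(int input) {
        return model.score(input);
    }
}

// Variant 2: only a small parameter is serialized; the model field is
// transient and is rebuilt on the receiving side in open().
final class LazyScorer implements Serializable {
    private static final long serialVersionUID = 1L;
    private final int modelBytes;     // assumption: stands in for a model path
    private transient FakeModel model;

    LazyScorer(int modelBytes) {
        this.modelBytes = modelBytes;
    }

    void open() {
        // A real job would read and parse the model from storage here.
        model = new FakeModel(modelBytes);
    }

    int apply(int input) {
        if (model == null) {
            open();
        }
        return model.score(input);
    }
}
```

Both variants return the same scores, but only the first one carries the full model payload inside the serialized function. Whether loading per task is acceptable depends on how often the model changes and how many parallel tasks read it.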

2016-09-05 14:11 GMT+02:00 Bauss, Julian <Julian.Bauss@bonprix.net>:

> Hi Stephan,
>
> thanks for your reply!
>
> It seems as if I can only use broadcast variables on DataSet operators
> (using myFunc.withBroadcastSet(…)).
>
> Is that right?
>
> I am working on a DataStream, though. Do streams offer similar
> functionality?
>
> Best Regards,
>
> Julian
>
> *From:* Stephan Ewen [mailto:sewen@apache.org]
> *Sent:* Friday, 2 September 2016 15:27
> *To:* user@flink.apache.org
> *Subject:* Re: Storing JPMML-Model Object as a Variable Closure?
>
> How about using a source and broadcast variable?
>
> You could write the model to storage (DFS), then read it with a source
> and use a broadcast variable to send it to all tasks.
>
> A single record can be very large, so it should work even if your model is
> quite big.
>
> Does that sound feasible?
>
> In future versions of Flink, you may be able to skip the "write to DFS"
> step and simply have the model in a collection source (when large RPC
> messages are supported).
>
> Best,
>
> Stephan
>
> On Fri, Sep 2, 2016 at 11:20 AM, Bauss, Julian <Julian.Bauss@bonprix.net>
> wrote:
>
> Hello Everybody,
>
> I’m currently refactoring some code and am looking for a better
> alternative for handling JPMML models in data streams. At the moment the
> Flink job I’m working on references a model object as a singleton, which
> I want to change because static references tend to cause problems in
> distributed systems.
>
> I thought about handing the model object to the function that uses it as
> a variable closure. The object can be between 16MB and 250MB in size
> (depending on the depth of the decision tree).
>
> According to
> https://cwiki.apache.org/confluence/display/FLINK/Variables+Closures+vs.+Broadcast+Variables
> that’s way too large, though.
>
> Are there any viable alternatives, or would this be the "right way" to
> handle this situation?
>
> Best Regards,
>
> Julian
>
>
> ************************************************************
> **************************************************
>
> bonprix Handelsgesellschaft mbH
> Sitz der Gesellschaft: Hamburg
>
> Geschäftsführung:
> Dr. Marcus Ackermann (Vorsitzender)
> Dr. Kai Heck
> Rien Jansen
> Markus Fuchshofen
> Beiratsvorsitzender: Alexander Birken
>
> Handelsregister AG Hamburg HR B 36 455
>
> Adresse:
>
> bonprix Handelsgesellschaft mbH
>
> Haldesdorfer Str. 61
> 22179 Hamburg
>
> Diese E-Mail enthält vertrauliche und/oder rechtlich geschützte
> Informationen.
> Wenn Sie nicht der richtige Adressat sind oder diese E-Mail irrtümlich
> erhalten haben,
> informieren Sie bitte sofort den Absender und vernichten Sie diese Mail.
> Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser E-Mail ist
> nicht gestattet.
>
> This e-mail may contain confidential and/or privileged information.
> If you are not the intended recipient (or have received the e-mail in
> error)
> please notify the sender immediately and delete this e-mail. Any
> unauthorized copying,
> disclosure or distribution of the material in this e-mail is strictly
> forbidden.
>
> ************************************************************
> **************************************************
