I'm currently playing around with some machine learning algorithms in Flink streaming.
I have an input stream that I partition by key. For each key I apply a map that feeds the records into a model and produces a prediction as output. Periodically, each operator instance needs to send model updates to all the other instances.
What is the best way to implement this structure?
My current idea is to use a CoMap function as the operator. The first stream is the raw data; the second stream carries the model updates, which I could just broadcast from the iterative stream. My problem right now is that I need the CoMap to effectively have two output streams: the model updates and the prediction results.
I could write a wrapper class containing both output types, but that would require me to separate them again afterwards. This feels very clunky. Is there a better way of dealing with this?
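
To make the wrapper idea concrete, here is a minimal plain-Java sketch with no Flink dependency. The class and method names (`WrapperOutputSketch`, `Output`, `split`) are hypothetical, and the payload types (a `Double` prediction and a `double[]` model update) are assumptions chosen only for illustration. It shows a single output type that carries either a prediction or a model update, plus the separation step that would have to follow the operator.

```java
import java.util.ArrayList;
import java.util.List;

public class WrapperOutputSketch {

    /** Hypothetical tagged wrapper: holds either a prediction or a model update, never both. */
    static final class Output {
        final Double prediction;    // non-null only for prediction results (assumed type)
        final double[] modelUpdate; // non-null only for model updates (assumed type)

        private Output(Double prediction, double[] modelUpdate) {
            this.prediction = prediction;
            this.modelUpdate = modelUpdate;
        }

        static Output prediction(double value) {
            return new Output(value, null);
        }

        static Output modelUpdate(double[] weights) {
            return new Output(null, weights);
        }

        boolean isPrediction() {
            return prediction != null;
        }
    }

    /** Separates the mixed output again; this is the extra step the wrapper forces. */
    static void split(List<Output> mixed, List<Double> predictions, List<double[]> updates) {
        for (Output o : mixed) {
            if (o.isPrediction()) {
                predictions.add(o.prediction);
            } else {
                updates.add(o.modelUpdate);
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for what the single operator output would contain.
        List<Output> mixed = new ArrayList<>();
        mixed.add(Output.prediction(0.7));
        mixed.add(Output.modelUpdate(new double[] {0.1, 0.2}));
        mixed.add(Output.prediction(0.3));

        List<Double> predictions = new ArrayList<>();
        List<double[]> updates = new ArrayList<>();
        split(mixed, predictions, updates);

        System.out.println(predictions.size() + " predictions, " + updates.size() + " model updates");
    }
}
```

In the streaming job, the `split` step would presumably become two filter-and-map passes over the operator's output stream, which is the part that feels clunky to me.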