flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Aljoscha Krettek <aljos...@apache.org>
Subject Re: FLink Streaming - Parallelize ArrayList
Date Tue, 29 Sep 2015 08:00:22 GMT
Hi,
I think for what you are trying a Connected FlatMap would be best suited.
Using this, you can have an operation that has two input streams:

input1 = env.socketStream(...)
input2 = env.socketStream(...)

result = input1.connect(input2)
    .flatMap(new CoFlatMapFunction<IN1, IN2, OUT> {
       void flatMap1(...) {...}
       void flatMap2(...) {...}
    })

here, the flatMap1 will get elements from the first input stream while
flatMap2 will get elements from the second input stream. You can now
internally keep a list of elements from the first input. You have to be
careful, however, to not let the list get too large. Otherwise, you might
blow your JVM with an OutOfMemory exception.

Cheers,
Aljoscha

On Tue, 29 Sep 2015 at 04:38 defstat <defstat@gmail.com> wrote:

> Hi all,
>
> Is there a way to either Broadcast or Parallelize  an ArrayList of custom
> objects?
>
> The application is as follows:
>
> I have 2 streams taken from two sockets. The first stream contains vectors
> that should populate one list of vectors by deciding whether an arriving
> vector is part of the list, after calculations made on the existing list
> vectors and the new one.
>
> The other stream is also containing vectors that use the above list, and
> sorts it using the info contained in the incoming vectors.
>
> I tried to solve this using
>
> private static final ArrayList<Vector> aSetOfVectors = new
> ArrayList<Vector>();
>
> vectorsFirstStream.fold(aSetOfVectors , new FoldFunction<Vector,
> ArrayList&lt;Vector>>() {
>  .... (Calculations and new aSetOfVectors produced) ......
>
> return aSetOfVectors;
> }
>
> vectorsSecondStream.fold(aSetOfVectors , new FoldFunction<Vector,
> ArrayList&lt;Vector>>() {
>  .... (Calculations and sorted vector list produced) ......
> }
>
> I think that that approach could work only locally. Is there a way to
> parallelize the above?
>
> Thank you in advance
>
>
>
>
> --
> View this message in context:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/FLink-Streaming-Parallelize-ArrayList-tp2957.html
> Sent from the Apache Flink User Mailing List archive. mailing list archive
> at Nabble.com.
>

Mime
View raw message