flink-user mailing list archives

From Rong Rong <walter...@gmail.com>
Subject Re: What's the advantage of using BroadcastState?
Date Sun, 19 Aug 2018 15:30:07 GMT
Hi Paul,

To add to Hequn's answer: broadcast state is typically used for "a
low-throughput stream containing a set of rules which we want to evaluate
against all elements coming from another stream" [1].
So one more item for the list of differences: the broadcast data is
"broadcast" across all keys when you process a keyed stream. This typically
matters when it is not possible to derive the same key field with a
KeySelector for both inputs of a connected stream.
Another difference is performance: the broadcast data is "stored locally
and is used to process all incoming elements on the other stream", so the
size of the broadcast state has to be managed carefully.
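
To make the "broadcast across all keys" point concrete, here is a toy model
in plain Java. It is not Flink API: BroadcastRoutingModel, Task, and the
threshold-style rules are names I made up for illustration, and the hash
routing is only an assumption standing in for keyed partitioning. The idea
it sketches is that each rule reaches every parallel task and is kept
locally, while each keyed element reaches exactly one task.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model (not Flink API): rules are broadcast, elements are routed by key. */
final class BroadcastRoutingModel {

    /** One parallel operator instance with its own local copy of the rules. */
    static final class Task {
        final Map<String, Integer> rules = new HashMap<>(); // rule name -> threshold
        final List<String> matches = new ArrayList<>();

        // Broadcast side: the only place where the local rule copy is modified.
        void onRule(String name, int threshold) {
            rules.put(name, threshold);
        }

        // Keyed side: reads the local rules, never writes them.
        void onElement(String key, int value) {
            for (Map.Entry<String, Integer> rule : rules.entrySet()) {
                if (value > rule.getValue()) {
                    matches.add(key + ":" + rule.getKey());
                }
            }
        }
    }

    final List<Task> tasks = new ArrayList<>();

    BroadcastRoutingModel(int parallelism) {
        for (int i = 0; i < parallelism; i++) {
            tasks.add(new Task());
        }
    }

    /** Every task receives every rule, regardless of any key. */
    void broadcastRule(String name, int threshold) {
        for (Task task : tasks) {
            task.onRule(name, threshold);
        }
    }

    /** Assumed stand-in for keyed partitioning: hash of the key modulo parallelism. */
    int taskFor(String key) {
        return Math.floorMod(key.hashCode(), tasks.size());
    }

    /** Each keyed element is delivered to exactly one task. */
    void sendElement(String key, int value) {
        tasks.get(taskFor(key)).onElement(key, value);
    }

    public static void main(String[] args) {
        BroadcastRoutingModel model = new BroadcastRoutingModel(3);
        model.broadcastRule("high", 10);
        model.sendElement("user-a", 42);
        model.sendElement("user-b", 5);
        for (int i = 0; i < model.tasks.size(); i++) {
            Task task = model.tasks.get(i);
            System.out.println("task " + i + " rules=" + task.rules + " matches=" + task.matches);
        }
    }
}
```

Since every task keeps a full copy of the rules, the memory cost grows with
the parallelism, which is why the size of the broadcast state matters.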


On Sun, Aug 19, 2018 at 1:40 AM Hequn Cheng <chenghequn@gmail.com> wrote:

> Hi Paul,
> There are some differences:
> 1. The BroadcastStream broadcasts the data for you, i.e., data is
> broadcast to all downstream tasks automatically.
> 2. To guarantee that the contents of the Broadcast State are the same
> across all parallel instances of our operator, read-write access is only
> given to the broadcast side.
> 3. For BroadcastState, Flink guarantees that upon restoring/rescaling
> there will be no duplicates and no missing data. In case of recovery with
> the same or smaller parallelism, each task reads its checkpointed state.
> Upon scaling up, each task reads its own state, and the remaining tasks
> (p_new - p_old) read checkpoints of previous tasks in a round-robin manner.
> MapState doesn't have such abilities.
> Best, Hequn
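
The restore rule in point 3 can be sketched in a few lines of plain Java.
This is only a model, not Flink code: BroadcastRescaleModel and
checkpointSourceFor are hypothetical names, and mapping new task i to old
task i % p_old is my assumption about what "round-robin" means here.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Flink API) of which checkpoint each task reads on restore. */
final class BroadcastRescaleModel {

    /**
     * Returns, for each task index under the new parallelism, the index of the
     * old task whose checkpointed broadcast state it reads.
     * Assumption: the extra tasks cycle through the old tasks as index % p_old.
     */
    static List<Integer> checkpointSourceFor(int oldParallelism, int newParallelism) {
        if (oldParallelism <= 0 || newParallelism <= 0) {
            throw new IllegalArgumentException("parallelism must be positive");
        }
        List<Integer> sources = new ArrayList<>();
        for (int task = 0; task < newParallelism; task++) {
            // Tasks below p_old read their own state (task % p_old == task);
            // tasks at or above p_old wrap around the old tasks in order.
            sources.add(task % oldParallelism);
        }
        return sources;
    }

    public static void main(String[] args) {
        System.out.println(checkpointSourceFor(2, 5)); // prints [0, 1, 0, 1, 0]
        System.out.println(checkpointSourceFor(4, 2)); // prints [0, 1]
    }
}
```

Because every old task checkpointed the same contents, any of these
assignments restores a complete copy with no duplicates.
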
> On Sun, Aug 19, 2018 at 11:18 AM, Paul Lam <paullin3280@gmail.com> wrote:
>> Hi,
>> AFAIK, the difference between a BroadcastStream and a normal DataStream
>> is that the BroadcastStream comes with a BroadcastState, but it seems that
>> the functionality of BroadcastState can also be achieved with MapState in a
>> CoMapFunction or something similar, since the control stream can still be
>> broadcast without being turned into a BroadcastStream. So I’m wondering
>> what the advantage of using BroadcastState is. Thanks a lot!
>> Best Regards,
>> Paul Lam
