flink-user mailing list archives

From Aljoscha Krettek <aljos...@apache.org>
Subject Re: design question
Date Mon, 25 Apr 2016 08:06:22 GMT
Hi,
the Flink docs cover the RocksDB state backend in the state backends guide:
https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/state_backends.html#the-rocksdbstatebackend
and in the Javadoc for RocksDBStateBackend:
<https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/contrib/streaming/state/RocksDBStateBackend.html>

Cheers,
Aljoscha

On Sun, 24 Apr 2016 at 21:58 Chen Bekor <chen.bekor@gmail.com> wrote:

> Cool - can you point me to some docs on how to configure RocksDB? I
> searched the online docs and found nothing substantial. Also, if I'm using
> an HDFS (S3-backed) cluster, how would that affect RocksDB? Can I configure
> it to run on optimized SSDs, etc.?
>
> Any help is appreciated.
>
>
> On Sun, Apr 24, 2016 at 7:57 AM, John Sherwood <jrs@vt.edu> wrote:
>
>> This sounds like you have some per-key state to keep track of, so the
>> 'correct' way to do it would be to keyBy the guid. I believe that if you
>> run your environment with the RocksDB state backend you will not OOM,
>> regardless of the number of GUIDs that are eventually tracked. Whether
>> Flink/stream processing is the most effective way to achieve your goal, I
>> can't say, but I am fairly confident that this particular aspect is not a
>> problem.
>>
>> On Sat, Apr 23, 2016 at 1:13 AM, Chen Bekor <chen.bekor@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I have a stream of incoming object versions (objects change over time)
>>> and a requirement to fetch the last known object version from a datastore
>>> in order to link it with the id of the new version, so that I end up with
>>> a linked list of object versions.
>>>
>>> All versions of an object share the same guid, so I was thinking about
>>> using Flink streaming to ensure ordering and avoid concurrency / race
>>> conditions in the linkage process (object versions might arrive unordered
>>> or in spikes).
>>>
>>> If I use the object guid as the key for a keyed stream, I am concerned I
>>> will end up with millions of windowed streams, causing OOM.
>>>
>>> What do you think the right approach would be? Do you think Flink is
>>> the right technology for this task?
>>>
>>
>>
>
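
The per-key state idea discussed in this thread (key the stream by guid and remember only the last version seen for each guid) can be sketched roughly as follows. This is a plain-Java illustration, not Flink code: the class name VersionLinker, the link method, and the in-memory map are all assumptions made up for the example. In a Flink job the map would presumably be replaced by keyed state after a keyBy on the guid, so that the state backend, not the JVM heap, decides where the entries live. The sketch also assumes versions arrive in order for a given guid, which the original question says may not always hold.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/**
 * Hypothetical stand-in for per-key state: one "last version" slot per guid.
 * Not part of any Flink API; names are illustrative only.
 */
final class VersionLinker {
    // Assumption: an ordinary in-memory map plays the role that keyed state
    // (managed by the configured state backend) would play in a Flink job.
    private final Map<String, String> lastVersionByGuid = new HashMap<>();

    /**
     * Records versionId as the newest version of the object identified by guid
     * and returns the previously recorded version id, if there was one.
     * The caller can use the returned id to link the new version to its predecessor.
     */
    Optional<String> link(String guid, String versionId) {
        // Map.put returns the value previously stored under the key, or null.
        String previous = lastVersionByGuid.put(guid, versionId);
        return Optional.ofNullable(previous);
    }

    public static void main(String[] args) {
        VersionLinker linker = new VersionLinker();
        // First version of an object has no predecessor to link to.
        System.out.println(linker.link("guid-a", "v1"));
        // Second version of the same object links back to the first.
        System.out.println(linker.link("guid-a", "v2"));
        // A different guid has its own, independent slot.
        System.out.println(linker.link("guid-b", "v1"));
    }
}
```

Only one entry is kept per guid, so the state grows with the number of distinct objects rather than with the number of versions; no window per key is involved.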
