From Anirudh Mallem <anirudh.mal...@247-inc.com>
Subject Reg Checkpoint size using RocksDb
Date Mon, 19 Dec 2016 09:47:37 GMT
I was experimenting with RocksDB as the state backend for my job, and to test its behavior
I modified the socket word count program to store state. I also wrote a RichMapFunction which
stores the state as a ValueState with a default value of null.
What the job does, basically, is this: for every word received, if the current state is null
it updates the state with a fixed value, say “abc”, and if the state is non-null
it is cleared.
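
To make the logic concrete, here is a minimal plain-Java sketch of what the map function does per word. This is not the actual Flink job: a HashMap stands in for the keyed ValueState, and the class and method names (ToggleStateSketch, onWord, liveEntries) are made up for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of the toggle logic described above.
// Assumption: the HashMap stands in for Flink's keyed ValueState (one entry per word).
public class ToggleStateSketch {
    private final Map<String, String> stateByWord = new HashMap<>();

    // If there is no state for this word, set it to "abc"; otherwise clear it.
    public void onWord(String word) {
        if (stateByWord.get(word) == null) {
            stateByWord.put(word, "abc");
        } else {
            stateByWord.remove(word);
        }
    }

    // Number of words that currently hold a non-null state.
    public int liveEntries() {
        return stateByWord.size();
    }

    public static void main(String[] args) {
        ToggleStateSketch job = new ToggleStateSketch();
        job.onWord("foo");
        System.out.println(job.liveEntries()); // 1: state for "foo" is set
        job.onWord("foo");
        System.out.println(job.liveEntries()); // 0: state for "foo" is cleared
    }
}
```

In this model the amount of live state goes back to zero after the second “foo”, which is why I expected the checkpointed size to drop as well.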
So ideally, if my input stream has the word “foo” twice, the corresponding state is
first set to “abc” and then cleared at the second “foo”. I see that this behavior
is occurring as expected, but the checkpointed size keeps increasing! Is this expected? I believe
the checkpointed size shown on the dashboard should decrease when some of the states are
cleared, right?
In this case, if the two “foo” words arrive in successive checkpointing intervals, then
we should observe one rise and one fall in the checkpointed size, right? Instead, I see the
checkpointed size increasing in both cases!

Any ideas as to what is happening here? My checkpoint interval is 5 seconds. Thanks.


