flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stephan Ewen <se...@apache.org>
Subject Re: Why TimerService interface in ProcessFunction doesn't have deleteEventTimeTimer
Date Mon, 24 Apr 2017 09:09:38 GMT
Here are the thoughts why I advocated to not expose the "delete" initially:

  (1) The original heap timer structure was not very sophisticated and
could not support efficient timer deletion (as Gyula indicated). As a core
rule in large scale systems: Never expose an operation you cannot do
efficiently, hence there should be no delete operation until there is a
better heap timer structure.

  (2) In general, we need timers to be able to go out-of core as well
(there may be many many timers in certain cases). That's why we picked a
RocksDB Timer Service for the large scale timers.

  (3) An additional challenge is that timers are per key and need to be
stored in a key-partitioned fashion on checkpoint. Any implementation needs
to handle that as well (so we probably need an extension of the classical
time wheel structures)

If all this is solved and supports efficient deletes, then we can add that
operation again, in my opinion.



On Sat, Apr 22, 2017 at 4:43 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> Logged FLINK-6359, referring to this thread.
>
> FYI
>
> On Sat, Apr 22, 2017 at 1:10 AM, Gyula Fóra <gyfora@apache.org> wrote:
>
>> Hi,
>>
>> I am not familiar with this data structure, I will try to read up on it.
>> But it looks interesting.
>>
>> For some reference, some links:
>>
>> https://www.confluent.io/blog/apache-kafka-purgatory-hierarc
>> hical-timing-wheels/
>>
>> http://www.cs.columbia.edu/~nahum/w6998/papers/sosp87-timing-wheels.pdf
>>
>> Cheers,
>> Gyula
>>
>> On Sat, Apr 22, 2017, 00:35 Ted Yu <yuzhihong@gmail.com> wrote:
>>
>>> Benjamin has an implementation for Hierarchical Timing Wheels (Apache
>>> License) :
>>>
>>> https://github.com/ben-manes/caffeine/blob/master/caffeine/s
>>> rc/main/java/com/github/benmanes/caffeine/cache/TimerWheel.java
>>>
>>> If there is some interest, we can port the above over.
>>>
>>> Cheers
>>>
>>> On Fri, Apr 21, 2017 at 12:44 PM, Gyula Fóra <gyfora@apache.org> wrote:
>>>
>>>> The timer will actually fire and will be removed at the original time,
>>>> but we don't trigger any action on it. We also remove the tombstone state
>>>> afterwards.
>>>>
>>>> So we use more memory yes depending on the length and number of timers
>>>> that were deleted. But it is eventually cleaned up.
>>>>
>>>> Gyula
>>>>
>>>> Ted Yu <yuzhihong@gmail.com> ezt írta (időpont: 2017. ápr. 21.,
P,
>>>> 21:38):
>>>>
>>>>> A bit curious: wouldn't using "tombstone" markers constitute some
>>>>> memory leak (since Timers are not released) ?
>>>>>
>>>>> Cheers
>>>>>
>>>>> On Fri, Apr 21, 2017 at 12:23 PM, Gyula Fóra <gyfora@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Hi!
>>>>>>
>>>>>> I thought I would drop my opinion here maybe it is relevant.
>>>>>>
>>>>>> We have used the Flink internal timer implementation in many of our
>>>>>> production applications, this supports the Timer deletion but the
deletion
>>>>>> actually turned out to be a huge performance bottleneck because of
the bad
>>>>>> deletion performance of the Priority queue.
>>>>>>
>>>>>> In many of our cases deletion could have been avoided by some more
>>>>>> clever registration/firing logic and we also ended up completely
avoiding
>>>>>> deletion and instead using "tombstone" markers by setting a flag
in the
>>>>>> state which timers not to fire when they actually trigger.
>>>>>>
>>>>>> Gyula
>>>>>>
>>>>>>
>>>>>>
>>>>>> Aljoscha Krettek <aljoscha@apache.org> ezt írta (időpont:
2017. ápr.
>>>>>> 21., P, 14:47):
>>>>>>
>>>>>>> Hi,
>>>>>>> the reasoning behind the limited user facing API was that we
were
>>>>>>> (are) not sure whether we would be able to support efficient
deletion of
>>>>>>> timers for different ways of storing timers.
>>>>>>>
>>>>>>> @Stephan, If I remember correctly you were the strongest advocate
>>>>>>> for not allowing timer deletion. What’s your thinking on this?
There was
>>>>>>> also a quick discussion on https://issues.apache.org/j
>>>>>>> ira/browse/FLINK-3089 where Xiaogang explained that the (new,
not
>>>>>>> merged) RocksDB based timers would have efficient timer deletion.
>>>>>>>
>>>>>>> Best,
>>>>>>> Aljoscha
>>>>>>>
>>>>>>> On 20. Apr 2017, at 11:56, Jagadish Bihani <jagadish@helpshift.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hi
>>>>>>>
>>>>>>> I am working on a use case where I want to start a timer for
a given
>>>>>>> event type and when that timer expires it will perform certain
action. This
>>>>>>> can be done using Process Function.
>>>>>>>
>>>>>>> But I also want to cancel scheduled timer in case of some other
>>>>>>> types of events. I also checked the implementation of
>>>>>>> HeapInternalTimerService which implements InternalTimerService
interface
>>>>>>> has those implementations already. Also SimpleTimerService which
overrides
>>>>>>> TimerService also uses InternalTimerService and simply passes
>>>>>>> VoidNamespace.INSTANCE.
>>>>>>>
>>>>>>> So in a way we are using InternalTimerService interface's
>>>>>>> implementations everywhere.
>>>>>>>
>>>>>>> So what is the reason that ProcessFunction.Context uses
>>>>>>> TimerService? Any reason 'deleteEventTimeTimer' is not exposed
to users? If
>>>>>>> I want to use the deleteEvent functionality how should I go about
it?
>>>>>>>
>>>>>>> --
>>>>>>> Thanks and Regards,
>>>>>>> Jagadish Bihani
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>
>

Mime
View raw message