incubator-s4-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Flavio Junqueira <...@yahoo-inc.com>
Subject Re: Thoughts on adding guaranteed message processing
Date Tue, 07 Aug 2012 13:53:06 GMT
I didn't mean to suggest a different way, I was trying to understand the definition of exactly-once.
As for the use case Benjamin has posted, he says that not even a single tag scan can be lost,
but can we guarantee that events are reliably delivered at-least-once with S4?

-Flavio

On Aug 7, 2012, at 7:38 AM, Karthik Kambatla wrote:

> Given that nodes crash, it seems essential to build reliability like it is
> in lower layers - think TCP. One trivial approach could be to use
> monotonically increasing sequence numbers to identify events in a stream.
> 
>   1. Event ordering: Hold events until all the previous sequence numbers
>   have been received.
>   2. Exactly-once: If a sequence number is smaller than previous one, it
>   is a duplicate.
>   3. Fault-tolerance: Store the latest sequence number along with the
>   checkpoint, replay events from there onwards.
> 
> This, of course, comes at a performance overhead and should be optional.
> 
> As I said, this is the first approach that comes to mind. It is indeed an
> interesting problem, and I feel we should not need to re-do stuff that is
> done at lower layers.
> 
> Please suggest improvements/alternatives as you see fit.
> 
> Thanks
> Karthik
> 
> On Mon, Aug 6, 2012 at 10:01 PM, Flavio Junqueira <fpj@yahoo-inc.com> wrote:
> 
>> 
>> On Aug 6, 2012, at 10:11 PM, Karthik Kambatla wrote:
>> 
>> Flavio - it is indeed tricky to offer exactly-once semantics. My
>> understanding is that the underlying comm-layer could filter out subsequent
>> duplicate events; however, we need to sacrifice ordering.
>> 
>> 
>> I was also thinking that if a node crash and we recover from checkpoints,
>> we may end up having messages applied twice.
>> 
>> -Flavio
>> 
>> Thanks
>> Karthik
>> 
>> On Mon, Aug 6, 2012 at 6:43 AM, "Benjamin Süß" <Gothic13@gmx.de> wrote:
>> 
>>> Hi Matthieu,
>>> 
>>> thank you for your reply. I had a specific use case in mind, indeed:
>>> 
>>> I am trying to track RFID tags in distributed systems. This means, that
>>> not even a single tag scan may get lost. And of course, none are to be sent
>>> twice or even more often as this would heavily confuse any surveillance
>>> routines I am going to implement.
>>> 
>>> Regarding your answers, especially point 3, I do not think this can be
>>> done with S4 at the moment, can it?
>>> 
>>> Regards,
>>> Benjamin
>>> 
>>> -------- Original-Nachricht --------
>>>> Datum: Tue, 31 Jul 2012 17:30:27 +0200
>>>> Von: Matthieu Morel <mmorel@apache.org>
>>>> An: s4-user@incubator.apache.org
>>>> Betreff: Re: Thoughts on adding guaranteed message processing
>>> 
>>>> On 7/31/12 2:54 PM, "Benjamin Süß" wrote:
>>>>> Hi there,
>>>>> 
>>>>> it is stated in several places that S4 does not include guaranteed
>>>> one-time message processing. So my question is: are there currently any
>>> plans on
>>>> adding this to S4? Or is it certain this is not going to happen? If
>>> there
>>>> are any plans on this, can I find further information somewhere?
>>>>> 
>>>> 
>>>> There are typically 3 requirements for guaranteeing one-time message
>>>> processing:
>>>> 
>>>> 1. reliable communication channels
>>>> 
>>>> 2. replayable input stream: you need an upstream component that is able
>>>> to store/bufferize the whole stream and replay on demand.
>>>> 
>>>> 3. tracking of messages, using some sort of piggybacking, possibly
>>>> requiring manual input from the user.
>>>> 
>>>> 
>>>> In S4 0.5.0, we already address 1. by providing communications through
>>>> TCP by default. Requirement 2. is quite straightforward to implement, by
>>>> adding some machinery to connect to a component such as Apache Kafka for
>>>> instance. We are considering options for 3.
>>>> 
>>>> 
>>>> Do you have a specific use case in mind?
>>>> 
>>>> Regards,
>>>> 
>>>> Matthieu
>>>> 
>>> 
>> 
>> 
>> 


Mime
View raw message