gearpump-dev mailing list archives

From "Manu Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (GEARPUMP-33) Message Delivery Guarantee
Date Fri, 15 Apr 2016 05:52:25 GMT

     [ https://issues.apache.org/jira/browse/GEARPUMP-33?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manu Zhang updated GEARPUMP-33:
-------------------------------
    Description: 
The original discussions are at [https://github.com/gearpump/gearpump/issues/1528] and [https://github.com/gearpump/gearpump/issues/354].

When a message flows through a stream processing system, the system tries to provide some
guarantee on message delivery. From weakest to strongest, these are:

# At most once delivery
  A message is processed zero or one times. Messages can be lost.

# At least once delivery
  A message is processed one or more times, such that at least one attempt succeeds. Messages
cannot be lost but can be duplicated.

# Exactly once delivery
  A message is processed exactly once. Messages can neither be lost nor duplicated.
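
The three definitions can be read as constraints on how many times each message is successfully processed. The following is a minimal illustrative sketch in Python, not Gearpump code; the function names and the sample dictionaries are hypothetical.

```python
from typing import Dict


def at_most_once(successes: Dict[str, int]) -> bool:
    """No message is processed more than once; a count of 0 (loss) is allowed."""
    return all(n <= 1 for n in successes.values())


def at_least_once(successes: Dict[str, int]) -> bool:
    """Every message is processed one or more times; duplicates are allowed."""
    return all(n >= 1 for n in successes.values())


def exactly_once(successes: Dict[str, int]) -> bool:
    """Every message is processed exactly one time."""
    return all(n == 1 for n in successes.values())


# Message "b" was lost and never processed.
lossy = {"a": 1, "b": 0, "c": 1}
# Message "b" was replayed and processed twice.
duplicated = {"a": 1, "b": 2, "c": 1}
```

Here `lossy` satisfies only the at-most-once check and `duplicated` satisfies only the at-least-once check; a run satisfies exactly-once only when it passes both.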

Gearpump tracks message loss between a sender Task and a receiver Task and replays the application
on message loss. If the source is TimeReplayable, then at-least-once delivery can be guaranteed.
In addition, if user state is stored through the PersistentState API, then exactly-once delivery
is guaranteed. Otherwise, at-most-once delivery is guaranteed. 
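
The rule in the paragraph above can be written as a small decision function. This is only an illustrative sketch in Python, not Gearpump code: `Guarantee` and `effective_guarantee` are hypothetical names, and the two boolean flags stand in for "the source is TimeReplayable" and "user state is stored through the PersistentState API". It assumes that "otherwise" refers to the case where the source is not replayable.

```python
from enum import Enum


class Guarantee(Enum):
    """Delivery guarantees, ordered from weakest to strongest."""
    AT_MOST_ONCE = 1
    AT_LEAST_ONCE = 2
    EXACTLY_ONCE = 3


def effective_guarantee(source_is_replayable: bool,
                        state_is_persistent: bool) -> Guarantee:
    """Hypothetical helper mirroring the rule described above."""
    if not source_is_replayable:
        # Assumption: without a replayable source the description
        # falls back to at-most-once, whatever the state handling.
        return Guarantee.AT_MOST_ONCE
    if state_is_persistent:
        # Replayable source plus persisted user state.
        return Guarantee.EXACTLY_ONCE
    # Replayable source only: replay may reprocess messages.
    return Guarantee.AT_LEAST_ONCE
```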

There are several limitations with the current implementation. 

1. If users only require at-most-once delivery, message loss tracking is not necessary and we
may get better performance without it.
2. We require the user's data source to be TimeReplayable for at-least-once/exactly-once delivery.
It would be better if we provided a TimeReplayable wrapper for user sources that are not replayable
(e.g. Twitter).
3. Further, it would be nice if we allowed users to switch between the different guarantees
through APIs or the dashboard.
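
As a starting point for discussing item 2, one possible shape for such a wrapper is to buffer each message with its timestamp as it is read, so that the stream can be re-read from a given time. The Python sketch below is purely illustrative: `ReplayableWrapper`, `read`, `replay_from` and `discard_before` are hypothetical names, not Gearpump's interface, and it assumes messages arrive in timestamp order. The buffer lives in memory, so a real wrapper would need durable storage to survive a restart.

```python
from collections import deque
from typing import Deque, Iterator, List, Optional, Tuple

Message = Tuple[int, str]  # (timestamp, payload)


class ReplayableWrapper:
    """Hypothetical wrapper that makes a one-shot source replayable by time."""

    def __init__(self, source: Iterator[Message]) -> None:
        self._source = source
        self._buffer: Deque[Message] = deque()

    def read(self) -> Optional[Message]:
        """Pull the next message from the source and remember it."""
        message = next(self._source, None)
        if message is not None:
            self._buffer.append(message)
        return message

    def replay_from(self, start_time: int) -> List[Message]:
        """Return buffered messages with timestamp >= start_time, in arrival order."""
        return [m for m in self._buffer if m[0] >= start_time]

    def discard_before(self, checkpoint_time: int) -> None:
        """Drop buffered messages older than a time that no longer needs replay."""
        while self._buffer and self._buffer[0][0] < checkpoint_time:
            self._buffer.popleft()


# Usage: read three messages, then replay everything from timestamp 2.
wrapper = ReplayableWrapper(iter([(1, "a"), (2, "b"), (3, "c")]))
for _ in range(3):
    wrapper.read()
replayed = wrapper.replay_from(2)
```

The `discard_before` method reflects one design trade-off: without some notion of a point before which replay is never needed, the buffer grows without bound.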

This JIRA issue is meant to gather requirements and ideas from the community and users. The
actual work will be divided into subtasks and committed step by step.
  

  was:
The original discussions are at [https://github.com/gearpump/gearpump/issues/1528] and [https://github.com/gearpump/gearpump/issues/354].

When a message flows through a stream processing system, the system tries to provide some
guarantee on message delivery. From weakest to strongest, these are:

# At most once delivery
  A message is processed zero or one times. Messages can be lost.

# At least once delivery
  A message is processed one or more times, such that at least one attempt succeeds. Messages
cannot be lost but can be duplicated.

# Exactly once delivery
  A message is processed exactly once. Messages can neither be lost nor duplicated.

Gearpump tracks message loss between a sender Task and a receiver Task and replays the application
on message loss. If the source is TimeReplayable, then at-least-once delivery can be guaranteed.
In addition, if user state is stored through the PersistentState API, then exactly-once delivery
is guaranteed. Otherwise, at-most-once delivery is guaranteed. 

There are several limitations with the current implementation. 

1. If users only require at-most-once delivery, message loss tracking is not necessary and we
may get better performance without it.
2. We require the user's data source to be TimeReplayable for at-least-once/exactly-once delivery.
It would be better if we provided a TimeReplayable wrapper for user sources that are not replayable
(e.g. Twitter).
3. Further, it will be nice if we allow users to switch between the different guarantees
through APIs or the dashboard.

This JIRA issue is meant to gather requirements and ideas from the community and users. The
actual work will be divided into subtasks and committed step by step.
  


> Message Delivery Guarantee
> --------------------------
>
>                 Key: GEARPUMP-33
>                 URL: https://issues.apache.org/jira/browse/GEARPUMP-33
>             Project: Apache Gearpump
>          Issue Type: Improvement
>          Components: streaming
>            Reporter: Manu Zhang
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
