apex-dev mailing list archives

From Gaurav Gupta <gau...@datatorrent.com>
Subject Re: Apex-119 - Distributed Operator design discussion
Date Thu, 12 Nov 2015 06:57:48 GMT
Is there a way to disable/enable this feature? Synchronizing all the partitions and bringing
them all back to the same common checkpoint after a failure would affect performance.

Thanks
- Gaurav

> On Nov 11, 2015, at 10:50 PM, Thomas Weise <thomas@datatorrent.com> wrote:
> 
> I would like to better understand the target use cases. This will also help
> to analyze trade-offs.
> 
> The proposal of synchronizing all partitions at a window boundary affects
> scalability, adds latency and dictates reset of all partitions on operator
> failure.
> 
> There are different levels of support for such "distributed data
> structures". For example, limiting each partition to a single writer and
> using version-based reads would relax the synchronization needs.
> Again, the goals and the pros and cons of different approaches need to be discussed.
> 
> 
> On Tue, Nov 10, 2015 at 2:34 PM, Sandesh Hegde <sandesh@datatorrent.com>
> wrote:
> 
>> Hello All,
>> 
>> Tim & I started working on Apex 119
>> <https://malhar.atlassian.net/browse/APEX-119> and came up with the
>> following design document.
>> 
>> The idea is to treat all the partitions of an operator as a single unit: they
>> all work on the same window, and if one of them fails, all the
>> partitions are brought back to a common checkpoint.
>> 
>> You can comment on the document. Once it is finalized, we will attach
>> it to the Jira issue.
>> 
>> 
>> https://docs.google.com/document/d/1Rau76WxAycyN9vQqP2bqDWZAwLw0u23xSh0_5fQ1980/edit?usp=sharing
>> 
>> Thanks
>> Sandesh
>> 
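
The recovery rule discussed in this thread (if one partition fails, every partition is brought back to a common checkpoint) can be sketched roughly as below. This is only an illustration under stated assumptions, not Apex code: window ids are modeled as plain increasing integers, each partition is assumed to keep a list of the window ids it has checkpointed, and the names `common_checkpoint` and `windows_to_replay` are hypothetical helpers invented for this sketch.

```python
from typing import Dict, List


def common_checkpoint(checkpoints: Dict[str, List[int]]) -> int:
    """Return the most recent window id that every partition has checkpointed.

    `checkpoints` maps a (hypothetical) partition name to the window ids
    at which that partition saved its state.
    """
    if not checkpoints:
        raise ValueError("no partitions given")
    # Only window ids present in every partition's list are safe to reset to.
    shared = set.intersection(*(set(ids) for ids in checkpoints.values()))
    if not shared:
        raise ValueError("partitions have no checkpoint in common")
    return max(shared)


def windows_to_replay(checkpoints: Dict[str, List[int]],
                      current_window: int) -> Dict[str, int]:
    """Count the windows each partition must reprocess after a group reset.

    Assumes all partitions were working on `current_window` when one failed,
    so healthy partitions are rolled back as far as the failed one.
    """
    target = common_checkpoint(checkpoints)
    return {name: current_window - target for name in checkpoints}


if __name__ == "__main__":
    state = {"p0": [10, 20, 30], "p1": [10, 20], "p2": [10, 20, 30]}
    print(common_checkpoint(state))
    print(windows_to_replay(state, current_window=34))
```

In this toy model a failure of any one partition makes the healthy partitions replay the same number of windows as the failed one, which is the performance cost raised above.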

