apex-dev mailing list archives

From Chandni Singh <chan...@datatorrent.com>
Subject Re: Apex-119 - Distributed Operator design discussion
Date Sun, 29 Nov 2015 03:42:28 GMT
Hey Tim/Sandesh,

While searching for a doc, I stumbled on a document about the Chord
protocol. I don't know how relevant it is to this project, but I wanted to
share it.
Here is the link:
https://docs.google.com/document/d/12iOWPaA82g3JahjUflEyvirQA4_hL8SrGyit_Iunobk/edit

Also, it is a standard protocol, so you can look it up and find more
information.

Thanks,
Chandni

On Fri, Nov 13, 2015 at 10:55 AM, Timothy Farkas <tim@datatorrent.com>
wrote:

> Sandesh and I have created some slides outlining some more possible design
> approaches along with their pros and cons.
>
>
> https://docs.google.com/presentation/d/1-gWwwq4Dd7g9Mai7XLlzA7R_F9nMqEWg3IKMD3OMbYE/edit?usp=sharing
>
> Please review and comment
>
> Thanks,
> Tim
>
> On Wed, Nov 11, 2015 at 11:20 PM, Amol Kekre <amol@datatorrent.com> wrote:
>
> > This feature should be off by default. That way it needs an explicit
> > user request (an attribute?), and any degradation in performance etc. is
> > the user's choice.
> >
> > Amol
> >
> >
> > On Wed, Nov 11, 2015 at 10:57 PM, Gaurav Gupta <gaurav@datatorrent.com>
> > wrote:
> >
> > > Is there a way to disable/enable this feature? Synchronizing all the
> > > partitions and bringing them all to the same common checkpoint after a
> > > failure would affect performance.
> > >
> > > Thanks
> > > - Gaurav
> > >
> > > > On Nov 11, 2015, at 10:50 PM, Thomas Weise <thomas@datatorrent.com>
> > > > wrote:
> > > >
> > > > I would like to better understand the target use cases. This will also
> > > > help to analyze trade-offs.
> > > >
> > > > The proposal of synchronizing all partitions at a window boundary
> > > > affects scalability, adds latency and dictates reset of all partitions
> > > > on operator failure.
> > > >
> > > > There are different levels of support for such a "distributed data
> > > > structure". For example, limiting each partition to a single writer and
> > > > version-based reads would allow for relaxation of synchronization needs.
> > > > Again, goals and pros and cons of different approaches need to be
> > > > discussed.
> > > >
> > > >
> > > > On Tue, Nov 10, 2015 at 2:34 PM, Sandesh Hegde <sandesh@datatorrent.com>
> > > > wrote:
> > > >
> > > >> Hello All,
> > > >>
> > > > >> Tim & I started working on Apex 119
> > > > >> <https://malhar.atlassian.net/browse/APEX-119> and came up with the
> > > > >> following design document.
> > > >>
> > > > >> The idea is to treat all the partitions of an operator as a single
> > > > >> unit: they all work on the same window, and if one of them fails, all
> > > > >> the partitions are brought back to a common checkpoint.
> > > >>
> > > > >> You can comment on the document. Once it is finalized, we will attach
> > > > >> it to the Jira.
> > > >>
> > > >>
> > > >>
> > >
> >
> https://docs.google.com/document/d/1Rau76WxAycyN9vQqP2bqDWZAwLw0u23xSh0_5fQ1980/edit?usp=sharing
> > > >>
> > > >> Thanks
> > > >> Sandesh
> > > >>
> > >
> > >
> >
>
