apex-dev mailing list archives

From Timothy Farkas <...@datatorrent.com>
Subject Re: Database Output Operator Improvements
Date Thu, 17 Dec 2015 19:17:16 GMT
Thanks guys, I remember now that this was done for Hive. Could someone take
a look at implementing the Reconciler pattern for Cassandra Output?

On Thu, Dec 17, 2015 at 11:13 AM, Pramod Immaneni <pramod@datatorrent.com>
wrote:

> Tim, we have a pattern for this called Reconciler, which Gaurav has also
> mentioned. There are some examples of it in Malhar.
>
> On Thu, Dec 17, 2015 at 9:47 AM, Timothy Farkas <tim@datatorrent.com>
> wrote:
>
> > Hi All,
> >
> > One of our users is outputting to Cassandra, and they want the output
> > operator to handle a Cassandra failure or Cassandra downtime gracefully.
> > Currently a lot of our database operators will just fail and redeploy
> > continually until the database comes back. This is a bad idea for a
> > couple of reasons:
> >
> > 1 - We rely on buffer server spooling to prevent data loss. If the
> > database is down for a long time (several hours or a day), we may run
> > out of space to spool for the buffer server, since it spools to local
> > disk and data is purged only after a window is committed. Furthermore,
> > this buffer server problem will exist for all the Streaming Containers
> > in the DAG, not just the one immediately upstream from the output
> > operator, since data is spooled to disk for all operators and only
> > removed once a window is committed.
> >
> > 2 - If there is another failure further upstream in the DAG, upstream
> > operators will be redeployed to a checkpoint less than or equal to the
> > checkpoint of the database operator in the at-least-once case. This could
> > mean redoing several hours or a day's worth of computation.
> >
> > We should support a mechanism to detect when the connection to a database
> > is lost, spool to HDFS using a WAL (write-ahead log), and then write the
> > contents of the WAL into the database once it comes back online. This
> > will save the local disk space of all the nodes used in the DAG, so that
> > only the data bound for the output operator needs to be spooled.
> >
> > Ticket here if anyone is interested in working on it:
> >
> > https://malhar.atlassian.net/browse/MLHR-1951
> >
> > Thanks,
> > Tim
> >
>
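
For anyone picking this up, the spool-then-replay idea from the original
message can be sketched roughly as below. This is only an illustration under
stated assumptions, not the Malhar Reconciler and not a Cassandra API:
RecordSink and SpoolingWriter are hypothetical names, the WAL is a local file
standing in for an HDFS file, records are assumed to be single-line strings,
and replay is at-least-once, so the database writes would need to be
idempotent.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

/** Hypothetical stand-in for a database client; not a Cassandra or Malhar API. */
interface RecordSink {
    void write(String record) throws IOException;
}

/** Sketch: write through while the sink is up, spool to a WAL file while it is down. */
class SpoolingWriter {
    private final RecordSink sink;
    private final Path wal; // local file here; the proposal would place this on HDFS
    private boolean spooling = false;

    SpoolingWriter(RecordSink sink, Path wal) {
        this.sink = sink;
        this.wal = wal;
    }

    /** Called once per tuple. */
    void process(String record) throws IOException {
        if (spooling) {
            append(record); // keep ordering: nothing bypasses the backlog
            return;
        }
        try {
            sink.write(record);
        } catch (IOException e) {
            spooling = true; // connection lost: switch to the WAL
            append(record);
        }
    }

    /** Called periodically, e.g. at a window boundary. Returns true once the backlog is drained. */
    boolean tryReplay() throws IOException {
        if (!spooling) {
            return true;
        }
        List<String> pending = Files.readAllLines(wal, StandardCharsets.UTF_8);
        int done = 0;
        try {
            for (String record : pending) {
                sink.write(record);
                done++;
            }
        } catch (IOException e) {
            // Still down: keep only the records that have not been replayed yet.
            Files.write(wal, pending.subList(done, pending.size()), StandardCharsets.UTF_8);
            return false;
        }
        Files.delete(wal);
        spooling = false;
        return true;
    }

    private void append(String record) throws IOException {
        Files.write(wal, (record + System.lineSeparator()).getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }
}
```

While the database is down the operator keeps consuming tuples, so windows
can still be committed and the buffer server can purge its spool. A real
operator would also have to tie the WAL position to checkpoints, which this
sketch leaves out.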
