apex-dev mailing list archives

From Mohit Jotwani <mo...@datatorrent.com>
Subject Re: proposal for reconciled JDBC output operator
Date Tue, 29 Mar 2016 04:50:39 GMT
Dear Ashwin,

The approach sounds good. I am assuming this will be done for all
output data stores and will not be limited to JDBC.

+1

Regards,
Mohit

On Tue, Mar 29, 2016 at 5:17 AM, Ashwin Chandra Putta <
ashwinchandrap@gmail.com> wrote:

> There are many use cases in which we write tuples to an external system
> using JDBC etc. At times the external system may be slow or down for a
> while. In those cases, the current implementation of the JDBC output
> operators fails and restarts until the external system is up again.
> Meanwhile, the DAG is slowed down by this operator. To deal with such
> scenarios, we should write the output in a reconciled fashion, where a
> reconciler thread writes at the pace of the external system. We should also
> provide the ability to spool the data to disk when the external system is
> down or the output operator's queue is full.
>
> Here are the proposed features for the output operator.
>
> 1. Write to the external system in a separate reconciler thread.
> 2. Queue the tuples in memory for the reconciler thread to consume.
> 3. Spool the incoming tuples to HDFS using a WAL (write-ahead log) when
> the queue is full.
> 4. Read from the WAL and write to the queue as the queue is being consumed.
> 5. When the external system is able to consume as fast as the incoming
> throughput, the WAL is not written. The queue just buffers the tuples
> before they are written to the external system.
>
> Here is the JIRA: https://issues.apache.org/jira/browse/APEXMALHAR-2037
>
> Please let me know if you have any feedback on the design.
>
> --
>
> Regards,
> Ashwin.
>
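
The five proposed features quoted above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the Malhar implementation: the class name ReconciledWriter and its methods are hypothetical, the HDFS WAL is replaced by a plain in-memory deque so the example is self-contained, and retrying when the external system is down is left out.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical sketch of a reconciled output writer; not a Malhar API. */
class ReconciledWriter<T> {
    // Feature 2: bounded in-memory queue consumed by the reconciler thread.
    private final BlockingQueue<T> queue;
    // Feature 3: stand-in for the HDFS WAL (assumption: in-memory only).
    private final Queue<T> spool = new ArrayDeque<>();
    // The write to the external system, e.g. a JDBC insert.
    private final Consumer<T> externalWrite;
    private final Thread reconciler;
    private volatile boolean running = true;

    ReconciledWriter(int capacity, Consumer<T> externalWrite) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.externalWrite = externalWrite;
        // Feature 1: external writes happen on a separate thread.
        this.reconciler = new Thread(this::drain, "reconciler");
        this.reconciler.start();
    }

    /** Called on the operator thread for each tuple; never blocks. */
    synchronized void process(T tuple) {
        // Feature 5: while the queue has room and nothing is spooled,
        // the spool is never touched. Once something is spooled, new
        // tuples go behind it so that ordering is preserved.
        if (!spool.isEmpty() || !queue.offer(tuple)) {
            spool.add(tuple);
        }
    }

    /** Feature 4: move spooled tuples back into the queue as it drains. */
    private synchronized void refill() {
        while (!spool.isEmpty() && queue.offer(spool.peek())) {
            spool.remove();
        }
    }

    /** Number of tuples not yet handed to the external system. */
    synchronized int pending() {
        return queue.size() + spool.size();
    }

    /** Reconciler loop: writes at whatever pace externalWrite allows. */
    private void drain() {
        try {
            while (running || pending() > 0) {
                T tuple = queue.poll(10, TimeUnit.MILLISECONDS);
                if (tuple != null) {
                    externalWrite.accept(tuple);
                }
                refill();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Stops accepting work and waits until everything pending is written. */
    void close() throws InterruptedException {
        running = false;
        reconciler.join();
    }
}
```

A caller would construct it with a small capacity and a sink, call process(tuple) for each incoming tuple, and call close() at teardown. A real operator would additionally need to make the spool durable and tie it to checkpointing so that tuples survive an operator restart.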
