apex-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Timothy Farkas <timothytiborfar...@gmail.com>
Subject Re: Partitionable,Idempotent & Pollable JDBCInput
Date Tue, 12 Jul 2016 20:26:48 GMT
Nice work Dev +1 for merging

On Tue, Jul 12, 2016 at 1:21 PM, Ashwin Chandra Putta <
ashwinchandrap@gmail.com> wrote:

> Finally we have a robust jdbc polling input operator. Since it is
> @evolving, we can make improvements over time as folks start using this
> operator.
>
> +1 for merging the PR.
>
> Regards,
> Ashwin.
>
> On Tue, Jul 12, 2016 at 11:17 AM, Devendra Tagare <
> devendrat@datatorrent.com
> > wrote:
>
> > All,
> >
> > We have created a JDBCPollInputOperator with the below features,
> >
> > 1. poll from external jdbc store asynchronously in the input operator.
> > 2. polling frequency and batch size are configurable.
> > 3.User can specify the polling query
> > 4.User can specify the columns to fetch as a part of the result set.
> > 5. It is idempotent and partition-able.
> > 6. Supports both batch + polling behavior.
> >
> > With the above set of features there as some assumptions for idempotency
> &
> > partitioning,
> > 1.User needs to provide tableName,dbConnection,setEmitColumnList,look-up
> > key.
> > 2.Optionally batchSize,pollInterval,Look-up key and a where clause can be
> > given.
> > 3.This operator uses static partitioning to arrive at range queries for
> > exactly once reads
> > 4.Assumption is that there is an ordered column using which range queries
> > can be formed.
> > 5.If an emitColumnList is provided, please ensure that the keyColumn is
> the
> > first column in the list
> > 6.Range queries are formed using the JdbcMetaDataUtility Output - comma
> > separated list of the emit columns eg columnA,columnB,columnC
> >
> > Per window the first and the last key processed is saved using the
> > FSWindowDataManager - (<lowerBound,UpperBound>,operatorId,windowId).This
> > (lowerBound,upperBoundPair) is then used for recovery.The queries are
> > constructed using the JDBCMetaDataUtility.
> >
> > JDBCMetaDataUtility
> > A utility class used to retrieve the metadata for a given unique key of a
> > SQL table. This class would emit range queries based on a primary index
> > given.
> >
> > Presently this operator has been tested with MySQL.
> > In the later iterations we intend to support in-clause support to enable
> > exactly once semantics for non-ordered key column(s).
> >
> > Here's a link to the PR,
> > https://github.com/apache/apex-malhar/pull/282
> >
> > Thanks,
> > Dev
> >
>
>
>
> --
>
> Regards,
> Ashwin.
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message