apex-dev mailing list archives

From Himanshu Bari <himanshub...@gmail.com>
Subject Re: At-least once semantics for Kafka to Cassandra ingest
Date Wed, 15 Feb 2017 05:04:41 GMT
Hey guys

We are looking for at-least-once semantics, not exactly-once. Since the
sink is Cassandra, it is OK if the same record is written twice; it will
just be overwritten on any reprocessing.

Himanshu
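
A minimal sketch of why replay is harmless here, assuming each record maps to a fixed primary key and the write is a plain insert/update (not a counter or list append). `KeyedTable` and `ingest_at_least_once` are hypothetical in-memory stand-ins for illustration, not Apex or Cassandra APIs:

```python
class KeyedTable:
    """Hypothetical stand-in for a Cassandra table keyed by primary key."""

    def __init__(self):
        self.rows = {}
        self.write_count = 0

    def upsert(self, key, value):
        # Writing an existing key overwrites the row instead of duplicating it.
        self.rows[key] = value
        self.write_count += 1


def ingest_at_least_once(messages, table, crash_after=None):
    """Write messages in order; optionally simulate a crash after N writes."""
    for i, (key, value) in enumerate(messages):
        if crash_after is not None and i == crash_after:
            raise RuntimeError("simulated operator failure")
        table.upsert(key, value)


messages = [("user-1", "a"), ("user-2", "b"), ("user-3", "c")]
table = KeyedTable()

try:
    ingest_at_least_once(messages, table, crash_after=2)
except RuntimeError:
    # Recovery: replay from the last checkpointed offset (here, the start),
    # so the first two records are written a second time.
    ingest_at_least_once(messages, table)

final_rows = dict(table.rows)
```

The table receives five writes in total but ends up with exactly three rows, which is the behaviour at-least-once delivery relies on.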

On Feb 14, 2017 9:01 PM, "Priyanka Gugale" <priyanka@datatorrent.com> wrote:

> For this particular case, Kafka -> Cassandra, you need not worry about
> partial windows. The Cassandra output operator writes in batches, i.e. all
> records received in a window are written at end window. So IMO, if you
> set exactly-once processing on the Kafka input operator and choose the
> transactional Cassandra output operator, you will achieve exactly-once
> processing. If you have other operators in your DAG, you might want to make
> sure they are idempotent (please check the blog shared by Sandesh for
> reference).
>
> -Priyanka
>
> On Wed, Feb 15, 2017 at 4:06 AM, Sandesh Hegde <sandesh@datatorrent.com>
> wrote:
>
> > The settings mentioned by Sanjay only guarantee exactly-once for complete
> > windows, not for a partial window processed by the operator; in that sense
> > the setting is a misnomer.
> > To achieve exactly-once, some preconditions need to be met, along with
> > support in the output operator. Here is a blog post that gives the idea
> > behind exactly-once:
> > https://www.datatorrent.com/blog/end-to-end-exactly-once-with-apache-apex/
> >
> > On Tue, Feb 14, 2017 at 2:11 PM Sanjay Pujare <sanjay@datatorrent.com>
> > wrote:
> >
> > > Have you taken a look at
> > > http://apex.apache.org/docs/apex/application_development/#exactly-once ?
> > > I.e., setting that processing mode on all the operators in the pipeline.
> > >
> > > Join us at Apex Big Data World-San Jose <
> > > http://www.apexbigdata.com/san-jose.html>, April 4, 2017!
> > >
> > > http://www.apexbigdata.com/san-jose-register.html
> > >
> > >
> > > On 2/14/17, 12:00 PM, "Himanshu Bari" <himanshubari@gmail.com> wrote:
> > >
> > >     How can we ensure that the Kafka-to-Cassandra ingestion pipeline in Apex
> > >     guarantees exactly-once processing semantics?
> > >     E.g., a message was read from Kafka, but the Apex app died before it was
> > >     written successfully to Cassandra.
> > >
> > >
> > >
> >
>
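
The window-batched, transactional write discussed in the thread can be sketched roughly as follows. This is only an illustration of the general idea (commit each window's batch together with its window id, and skip windows already committed on replay), not the actual Apex operator code. `TransactionalStore`, `WindowedOutput`, and `run` are hypothetical names, and the sketch assumes the input replays the same records in the same windows after a restart:

```python
class TransactionalStore:
    """Hypothetical store that commits a batch and its window id atomically."""

    def __init__(self):
        self.rows = []              # append-only, so duplicates would be visible
        self.committed_window = -1

    def commit(self, window_id, batch):
        # Both updates stand in for a single atomic transaction.
        self.rows.extend(batch)
        self.committed_window = window_id


class WindowedOutput:
    """Buffers records per window and writes them once at end of window."""

    def __init__(self, store):
        self.store = store
        self.window_id = None
        self.batch = []

    def begin_window(self, window_id):
        self.window_id = window_id
        self.batch = []

    def process(self, record):
        self.batch.append(record)

    def end_window(self):
        # On replay, a window that was already committed is skipped.
        if self.window_id <= self.store.committed_window:
            return
        self.store.commit(self.window_id, self.batch)


def run(output, windows):
    for window_id, records in windows:
        output.begin_window(window_id)
        for record in records:
            output.process(record)
        output.end_window()


store = TransactionalStore()
windows = [(1, ["a", "b"]), (2, ["c"]), (3, ["d", "e"])]

run(WindowedOutput(store), windows[:2])   # first run dies after window 2
run(WindowedOutput(store), windows)       # restart replays from window 1
```

Even though windows 1 and 2 are replayed after the restart, each record appears in the store once, because the committed window id acts as the deduplication marker.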
