kudu-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: Apache Apex supports kudu as a high throughput sink
Date Tue, 30 May 2017 19:50:29 GMT
Hey Ananth,

Thanks for posting this, and for working on the Kudu sink for Apex.

One thing I wanted to note in the article:

"Kudu output operator allows the client side timestamps to be propagated to
the Kudu server where the mutation is executed. This allows for out of
sequence data tuples to be ordered on the server side. The following
snippet of code in the upstream operator shows how this can be done."

I think your understanding of the setPropagatedTimestamp() call is not
quite right. This timestamp propagation serves as a lower-bound for the
assigned timestamp at the server side, not as an exact setting of the
server side timestamp. Thus, if you perform two inserts, and the second
insert has a lower propagated timestamp, it does _not_ ensure that the
first one takes precedence. Since the Propagated Timestamp is a
lower-bound, the second insert will still be assigned a higher timestamp
than the first.
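
To make the lower-bound behaviour concrete, here is a minimal sketch in plain Java. It is a toy model, not Kudu code: the class ToyLowerBoundClock and its assign() method are hypothetical names invented for illustration, and the "max plus one" rule is only an assumed simplification of how a server might pick a timestamp no lower than the propagated one.

```java
// Toy model only: this does NOT use the Kudu client API.
// ToyLowerBoundClock and assign() are hypothetical names for illustration.
final class ToyLowerBoundClock {
    // Highest timestamp this toy server has handed out so far.
    private long lastAssigned = 0;

    // Treats the propagated timestamp as a lower bound: the assigned
    // timestamp is strictly greater than both the propagated value and
    // every timestamp assigned earlier.
    long assign(long propagatedTimestamp) {
        lastAssigned = Math.max(lastAssigned, propagatedTimestamp) + 1;
        return lastAssigned;
    }

    public static void main(String[] args) {
        ToyLowerBoundClock server = new ToyLowerBoundClock();
        long first = server.assign(100);  // first insert, propagated timestamp 100
        long second = server.assign(50);  // second insert, LOWER propagated timestamp
        // The second insert is still ordered after the first.
        System.out.println("first=" + first + " second=" + second);
    }
}
```

In this model the second insert cannot be placed before the first, no matter how small its propagated timestamp is.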

The purpose of this advanced API is to allow causal ordering to be
maintained between two writes. For example, imagine that client A writes
data from machine A, and then communicates with client B on machine B.
Then, client B performs a write. If we want to ensure that B's write is
assigned a higher timestamp than A's write, the setPropagatedTimestamp()
API can ensure that (by setting the timestamp of A's write as the lower
bound for B's write). However, it can't be used to back-date a write, as
the article seems to imply.
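
The causal-ordering case can be sketched the same way. Again this is a toy model in plain Java rather than Kudu code: ToyCausalOrdering, ToyServer, write() and NO_TIMESTAMP are hypothetical names, and the two "servers" simply stand in for two machines whose clocks disagree.

```java
// Toy model only: this does NOT use the Kudu client API.
// All names below are hypothetical and exist only for illustration.
final class ToyCausalOrdering {
    // Placeholder meaning "the client propagated nothing".
    static final long NO_TIMESTAMP = 0L;

    // Stand-in for a server with its own (possibly skewed) clock.
    static final class ToyServer {
        private long clock;

        ToyServer(long startingClock) {
            this.clock = startingClock;
        }

        // Assigns a timestamp above both the local clock and the
        // propagated lower bound, then returns it to the client.
        long write(long propagatedTimestamp) {
            clock = Math.max(clock, propagatedTimestamp) + 1;
            return clock;
        }
    }

    public static void main(String[] args) {
        ToyServer serverForA = new ToyServer(1000); // clock running ahead
        ToyServer serverForB = new ToyServer(10);   // clock running behind

        // Client A writes first and remembers the timestamp it was given.
        long tsA = serverForA.write(NO_TIMESTAMP);

        // Without propagation, B's later write could get a lower timestamp.
        long tsBWithout = new ToyServer(10).write(NO_TIMESTAMP);

        // With A's timestamp passed along as the lower bound, B's write
        // is ordered after A's.
        long tsBWith = serverForB.write(tsA);

        System.out.println("A=" + tsA
                + " B(no propagation)=" + tsBWithout
                + " B(propagated)=" + tsBWith);
    }
}
```

The only thing propagation changes here is that B's timestamp is pushed above A's; it never lets a client pick an earlier timestamp.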

Otherwise, the post is great! Thanks again for sharing your experience and


On Tue, May 30, 2017 at 11:33 AM, Ananth G <ananthg.apex@gmail.com> wrote:

> Hello All,
> Apache Apex now enables low latency high throughput writes to Kudu as a
> sink. More details on this on the atrato blog here:
> http://www.atrato.io/blog/2017/05/28/apex-kudu-output/ . Please use the
> comments section to provide any feedback.
> Regards,
> Ananth

Todd Lipcon
Software Engineer, Cloudera
