bookkeeper-distributedlog-dev mailing list archives

From Xi Liu <xi.liu....@gmail.com>
Subject Re: Proxy Client - Batch Ordering / Commit
Date Fri, 07 Oct 2016 14:33:34 GMT
We investigated DL for a similar use case. I believe 1-5 are already met
by the current proxy with atomic writes. However, there is a limit on how
large a batch can be: 1 megabyte, which I believe is the size limit of a
BookKeeper entry.
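
To make the size constraint concrete, here is a minimal sketch of a client-side check a caller might run before sending a batch. The constant and the function names are hypothetical, the limit is simply the 1 megabyte figure above, and any per-record framing overhead is ignored, so treat it as an illustration rather than how the proxy actually measures a batch.

```python
# Hypothetical limit taken from the 1 megabyte figure mentioned above; the
# real cap and any framing overhead are assumptions, not DL constants.
MAX_BATCH_BYTES = 1024 * 1024


def batch_payload_size(records):
    """Total payload bytes of a batch (ignores any framing overhead)."""
    return sum(len(r) for r in records)


def fits_in_one_batch(records, limit=MAX_BATCH_BYTES):
    """True if the batch payload is within the assumed limit."""
    return batch_payload_size(records) <= limit


small = [b"x" * 1000] * 10      # 10,000 bytes of payload
large = [b"x" * 300_000] * 4    # 1,200,000 bytes of payload

print(fits_in_one_batch(small))
print(fits_in_one_batch(large))
```

A batch that fails this check cannot be written atomically under the current limit, which is the motivation for the transaction proposal further down.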

6 is guaranteed if you use the core library directly. It is hard to resend a
record using a thin proxy client, since the client is unaware of any sequence
numbers.
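
As a rough illustration of why sequence numbers matter for resends, the sketch below models a stream as an in-memory list. ToyLog, SequenceConflict, write_batch and read_from are all hypothetical names for this example and are not part of the DistributedLog API. The point is only that a resend carrying an already-used sequence number can be rejected, which tells the client to read back what was committed.

```python
class SequenceConflict(Exception):
    """Raised when a write does not start at the next expected sequence number."""


class ToyLog:
    """In-memory stand-in for a stream; not the DistributedLog API."""

    def __init__(self):
        self.records = []
        self.next_seq = 0

    def write_batch(self, first_seq, batch):
        # Reject a batch whose starting sequence number is not the next one expected.
        if first_seq != self.next_seq:
            raise SequenceConflict(f"expected {self.next_seq}, got {first_seq}")
        self.records.extend(batch)
        self.next_seq += len(batch)
        return self.next_seq

    def read_from(self, seq):
        # Return every record committed at or after the given sequence number.
        return self.records[seq:]


log = ToyLog()
log.write_batch(0, ["a", "b"])        # committed, but suppose the ack was lost

committed = None
try:
    log.write_batch(0, ["a", "b"])    # blind resend with the same sequence number
except SequenceConflict:
    committed = log.read_from(0)      # the failure signals that a read is needed
```

A thin client that never sees sequence numbers has nothing to pass as first_seq, so it cannot tell a lost acknowledgement from a lost write.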

We are currently proposing to add transaction semantics to DL to get rid
of the size limitation and the proxy client's unawareness of sequence
numbers. Here is our idea:
http://mail-archives.apache.org/mod_mbox/incubator-distributedlog-dev/201609.mbox/%3cCAAC6BxP5YyEHwG0ZCF5soh42X=xuYwYmL4nXsYBYiofzxpVk6g@mail.gmail.com%3e

I am not sure whether your idea is similar to ours, but we'd like to
collaborate with the community if anyone has a similar idea.

- Xi


On Tue, Oct 4, 2016 at 3:39 PM, Cameron Hatfield <kinguy@gmail.com> wrote:

> I have a question about the Proxy Client. Basically, for our use cases, we
> want to guarantee ordering at the key level, irrespective of the ordering
> of the partition it may be assigned to as a whole. Due to the source of the
> data (HBase Replication), we cannot guarantee that a single partition will
> be owned for writes by the same client. This means the proxy client works
> well (since we don't care which proxy owns the partition we are writing
> to).
>
>
> However, the guarantees we need when writing a batch consist of:
> Definition of a Batch: The set of records sent to the writeBatch endpoint
> on the proxy
>
> 1. Batch success: If the client receives a success from the proxy, then
> that batch is successfully written
>
> 2. Inter-Batch ordering: Once a batch has been written successfully by the
> client, when another batch is written, it will be guaranteed to be ordered
> after the last batch (if it is the same stream).
>
> 3. Intra-Batch ordering: Within a batch of writes, the records will be
> committed in order
>
> 4. Intra-Batch failure ordering: If an individual record fails to write
> within a batch, all records after that record will not be written.
>
> 5. Batch Commit: Guarantee that if a batch returns a success, it will be
> written
>
> 6. Read-after-write: Once a batch is committed, within a limited time-frame
> it will be able to be read. This is required in the case of failure, so
> that the client can see what actually got committed. I believe the
> time-frame part could be removed if the client can send in the same
> sequence number that was written previously, since it would then fail and
> we would know that a read needs to occur.
>
>
> So, my basic question is whether this is currently possible in the proxy. I
> don't believe it gives these guarantees as it stands today, but I am not
> 100% sure how all of the futures in the code handle failures.
> If not, where in the code would be the relevant places to add the ability
> to do this, and would the project be interested in a pull request?
>
>
> Thanks,
> Cameron
>
