ignite-dev mailing list archives

From Igor Rudyak <irud...@gmail.com>
Subject Re: Batch support in Cassandra store
Date Sat, 30 Jul 2016 00:19:59 GMT
Hi Valentin,

1) Regarding unlogged batches, I think it doesn't make sense to support
them, because:
- They are deprecated starting from Cassandra 3.0 (which we are currently
using in the Cassandra module)
- According to the Cassandra documentation (
http://docs.datastax.com/en/cql/3.1/cql/cql_using/useBatch.html), "Batches
are often mistakenly used in an attempt to optimize performance". The Cassandra
developers say that avoiding batches (
https://medium.com/@foundev/cassandra-batch-loading-without-the-batch-keyword-40f00e35e23e#.rxkmfe209)
is the fastest way to load data. I checked this with batches whose
records have different partition keys, and it's definitely true. For a small
batch of records that all share the same partition key (affinity in Ignite)
batches could provide better performance, but I didn't investigate this case
deeply (what the optimal batch size is, how significant the
performance benefits are, etc.). I can try to do some load tests to get a better
understanding of this.
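To make the "same partition key" case concrete, here is a minimal sketch (all names are illustrative, not the actual Ignite Cassandra module code) that groups a putAll-style map of entries by their Cassandra partition key, so each group could be sent as one small single-partition batch while everything else goes as individual async writes:

```java
import java.util.*;
import java.util.function.Function;

public class BatchGrouper {
    /**
     * Groups entries by partition key so each group could be sent as one
     * small single-partition batch (the only case where an unlogged batch
     * may beat individual async writes). partitionKeyOf is a placeholder
     * for however the mapper derives the Cassandra partition key from an
     * Ignite cache key.
     */
    public static <K, V> Map<Object, Map<K, V>> groupByPartition(
            Map<K, V> entries, Function<K, Object> partitionKeyOf) {
        Map<Object, Map<K, V>> groups = new HashMap<>();
        for (Map.Entry<K, V> e : entries.entrySet()) {
            groups.computeIfAbsent(partitionKeyOf.apply(e.getKey()),
                    p -> new LinkedHashMap<>()).put(e.getKey(), e.getValue());
        }
        return groups;
    }
}
```

A load test could then compare sending each group as one batch vs. firing every mutation individually.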

2) Regarding logged batches, I think it makes sense to support them in the
Cassandra module for transactional caches. The bad thing is that they don't
provide isolation; the good thing is they guarantee that all your changes
will eventually be committed and visible to clients. Thus it's still better
than nothing... However, there is a better approach. We can
implement a transactional protocol on top of Cassandra, which will give us
atomic read isolation - you'll either see all the changes made by a
transaction or none of them. For example, we can implement RAMP transactions (
http://www.bailis.org/papers/ramp-sigmod2014.pdf), because they provide rather
low overhead.
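To illustrate the RAMP idea, below is a minimal, single-process sketch of RAMP-Fast reads (illustrative only; the class and method names are hypothetical, not an Ignite or Cassandra API). Each version carries the transaction timestamp plus the set of sibling keys written by the same transaction, and a reader uses that metadata to detect a fractured read and repair it with a second round:

```java
import java.util.*;

// Minimal in-memory sketch of RAMP-Fast read-atomic reads (illustrative only).
public class RampFastStore {
    // A version carries the tx timestamp, the sibling keys written by the
    // same transaction (the RAMP metadata), and the value itself.
    record Version(long ts, Set<String> siblings, String value) {}

    private final Map<String, Version> committed = new HashMap<>();
    private final Map<String, Map<Long, Version>> prepared = new HashMap<>();

    // Phase 1 of a RAMP write: make every sibling version fetchable by ts.
    public void prepare(long ts, Map<String, String> updates) {
        for (var e : updates.entrySet()) {
            prepared.computeIfAbsent(e.getKey(), k -> new HashMap<>())
                    .put(ts, new Version(ts, updates.keySet(), e.getValue()));
        }
    }

    // Phase 2: expose one key's prepared version as its latest committed one.
    // Committing keys one at a time simulates an in-flight transaction.
    public void commit(long ts, String key) {
        committed.put(key, prepared.get(key).get(ts));
    }

    // Round 1 reads the latest committed versions; the sibling metadata
    // tells us the highest ts each key must have; round 2 fetches any
    // missing prepared versions to repair a fractured read.
    public Map<String, String> readAtomic(Set<String> keys) {
        Map<String, Version> first = new HashMap<>();
        for (String k : keys) first.put(k, committed.get(k));

        Map<String, Long> required = new HashMap<>();
        for (Version v : first.values()) {
            if (v == null) continue;
            for (String sib : v.siblings())
                if (keys.contains(sib)) required.merge(sib, v.ts(), Math::max);
        }

        Map<String, String> result = new HashMap<>();
        for (String k : keys) {
            Version v = first.get(k);
            long need = required.getOrDefault(k, Long.MIN_VALUE);
            if ((v == null || v.ts() < need) && prepared.containsKey(k))
                v = prepared.get(k).getOrDefault(need, v);
            result.put(k, v == null ? null : v.value());
        }
        return result;
    }
}
```

The second read round only fires for keys whose first-round version is older than what a sibling's metadata requires, which is why RAMP-Fast stays cheap in the common case.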

Igor Rudyak

On Thu, Jul 28, 2016 at 11:00 PM, Valentin Kulichenko <
valentin.kulichenko@gmail.com> wrote:

> Hi Igor,
>
> I'm not a big Cassandra expert, but here are my thoughts.
>
> 1. Sending updates in a batch is always better than sending them one by
> one. For example, if you do putAll in Ignite with 100 entries, and these
> entries are split across 5 nodes, the client will send 5 requests instead
> of 100. This provides a significant performance improvement. Is there a way
> to use a similar approach in Cassandra?
> 2. As for logged batches, I can easily believe that this is a rarely used
> feature, but since it exists in Cassandra, I can't find a single reason not
> to support it in our store as an option. Users that come across those
> rare cases will only say thank you to us :)
>
> What do you think?
>
> -Val
>
> On Thu, Jul 28, 2016 at 10:41 PM, Igor Rudyak <irudyak@gmail.com> wrote:
>
>> There are actually some cases when atomic read isolation in Cassandra
>> could be important. Let's assume a batch was persisted in Cassandra, but not
>> finalized yet - a read operation from Cassandra returns only the partially
>> committed data of the batch. In such a situation we have problems when:
>>
>> 1) Some of the batch records have already expired from the Ignite cache and
>> we read them from the persistent store (Cassandra in our case).
>>
>> 2) All Ignite nodes storing the batch records (or a subset of them) died (or,
>> for example, became unavailable for 10 sec because of a network problem).
>> While reading such records from the Ignite cache we will be redirected to the
>> persistent store.
>>
>> 3) A network separation occurred in such a way that we now have two Ignite
>> clusters, but all the replicas of the batch data are located in only one of
>> these clusters. Again, while reading such records from the Ignite cache on the
>> second cluster we will be redirected to the persistent store.
>>
>> In all the mentioned cases, if the Cassandra batch isn't finalized yet, we
>> will read partially committed transaction data.
>>
>>
>> On Thu, Jul 28, 2016 at 6:52 AM, Luiz Felipe Trevisan <
>> luizfelipe.trevisan@gmail.com> wrote:
>>
>> > I totally agree with you regarding the guarantees we have with logged
>> > batches, and I'm also pretty much aware of the performance penalty
>> > involved in using this solution.
>> >
>> > But since all read operations are executed via Ignite, it means that
>> > isolation at the Cassandra level is not really important. I think the only
>> > guarantee really needed is that we don't end up with a partial insert in
>> > Cassandra in case we have a failure in Ignite and we lose the node that
>> > was responsible for this write operation.
>> >
>> > My other assumption is that the write operation needs to finish before an
>> > eviction happens for this entry and we lose the data in cache (since batch
>> > doesn't guarantee isolation). However, if we cannot achieve this, I don't
>> > see the point of using Ignite as a cache store.
>> >
>> > Luiz
>> >
>> > --
>> > Luiz Felipe Trevisan
>> >
>> > On Wed, Jul 27, 2016 at 4:55 PM, Igor Rudyak <irudyak@gmail.com> wrote:
>> >
>> >> Hi Luiz,
>> >>
>> >> Logged batches are not the solution for achieving an atomic view of your
>> >> Ignite transaction changes in Cassandra.
>> >>
>> >> The problem with logged (aka atomic) batches is that they only guarantee
>> >> that if any part of the batch succeeds, all of it eventually will; no other
>> >> transactional enforcement is done at the batch level. For example, there is
>> >> no batch isolation. Clients are able to read the first updated rows from the
>> >> batch while other rows are still being updated on the server (in RDBMS
>> >> terminology this means *READ-UNCOMMITTED* isolation level). Thus Cassandra
>> >> means "atomic" only in the database sense that if any part of the batch
>> >> succeeds, all of it will.
>> >>
>> >> Probably the best way to achieve read-atomic isolation for an Ignite
>> >> transaction persisting data into Cassandra is to implement RAMP
>> >> transactions (http://www.bailis.org/papers/ramp-sigmod2014.pdf) on top
>> >> of Cassandra.
>> >>
>> >> I may create a ticket for this if the community would like it.
>> >>
>> >>
>> >> Igor Rudyak
>> >>
>> >>
>> >> On Wed, Jul 27, 2016 at 12:55 PM, Luiz Felipe Trevisan <
>> >> luizfelipe.trevisan@gmail.com> wrote:
>> >>
>> >>> Hi Igor,
>> >>>
>> >>> Does it make sense to you to use logged batches to guarantee atomicity
>> >>> in Cassandra in cases where we are doing a cross-cache transaction
>> >>> operation?
>> >>>
>> >>> Luiz
>> >>>
>> >>> --
>> >>> Luiz Felipe Trevisan
>> >>>
>> >>> On Wed, Jul 27, 2016 at 2:05 AM, Dmitriy Setrakyan <
>> >>> dsetrakyan@apache.org> wrote:
>> >>>
>> >>>> I am very confused still. Ilya, can you please explain what happens
>> in
>> >>>> Cassandra if user calls IgniteCache.putAll(...) method?
>> >>>>
>> >>>> In Ignite, if putAll(...) is called, Ignite will make the best
>> effort to
>> >>>> execute the update as a batch, in which case the performance is
>> better.
>> >>>> What is the analogy in Cassandra?
>> >>>>
>> >>>> D.
>> >>>>
>> >>>> On Tue, Jul 26, 2016 at 9:16 PM, Igor Rudyak <irudyak@gmail.com>
>> wrote:
>> >>>>
>> >>>> > Dmitriy,
>> >>>> >
>> >>>> > The approach is exactly the same for all async read/write/delete
>> >>>> > operations - the Cassandra session just provides an
>> >>>> > executeAsync(statement) function for all types of operations.
>> >>>> >
>> >>>> > To be more detailed about Cassandra batches, there are actually two
>> >>>> > types of batches:
>> >>>> >
>> >>>> > 1) *Logged batch* (aka atomic) - the main purpose of such batches is
>> >>>> > to keep duplicated data in sync while updating multiple tables, but
>> >>>> > at the cost of performance.
>> >>>> >
>> >>>> > 2) *Unlogged batch* - the only specific case for such a batch is when
>> >>>> > all updates are addressed to only *one* partition key and the batch
>> >>>> > has a "*reasonable size*". In such a situation there *could be*
>> >>>> > performance benefits if you are using the Cassandra *TokenAware* load
>> >>>> > balancing policy. In this particular case all the updates will go
>> >>>> > directly, without any additional coordination, to the primary node,
>> >>>> > which is responsible for storing data for this partition key.
>> >>>> >
>> >>>> > The *generic rule* is that *individual updates using async mode*
>> >>>> > provide the best performance (
>> >>>> > https://docs.datastax.com/en/cql/3.1/cql/cql_using/useBatch.html).
>> >>>> > That's because they spread all updates across the whole cluster. In
>> >>>> > contrast to this, when you are using batches, what this actually does
>> >>>> > is put a huge amount of pressure on a single coordinator node. This is
>> >>>> > because the coordinator needs to forward each individual
>> >>>> > insert/update/delete to the correct replicas. In general you're just
>> >>>> > losing all the benefit of the Cassandra TokenAware load balancing
>> >>>> > policy when you're updating different partitions in a single round
>> >>>> > trip to the database.
>> >>>> >
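The "individual updates in async mode" pattern described above can be sketched independently of the driver: fire one future per mutation and wait for all of them. The `AsyncSession` interface below is a stand-in for a driver session's `executeAsync`, not a real API:

```java
import java.util.*;
import java.util.concurrent.*;

public class AsyncWriter {
    // Stand-in for a driver session's executeAsync(statement); in the real
    // DataStax Java driver each statement likewise returns its own future.
    public interface AsyncSession {
        CompletableFuture<Void> executeAsync(String statement);
    }

    // Fire one async mutation per statement and block until all complete.
    // With a token-aware load balancing policy each write goes straight to
    // a replica, spreading the load across the whole cluster instead of
    // funneling it through a single coordinator.
    public static void writeAll(AsyncSession session, List<String> statements) {
        List<CompletableFuture<Void>> futures = new ArrayList<>();
        for (String stmt : statements)
            futures.add(session.executeAsync(stmt));
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
    }
}
```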
>> >>>> > Probably the only enhancement which could be done is to split our
>> >>>> > batch into smaller batches, each of which updates records having the
>> >>>> > same partition key. In this case it could provide some performance
>> >>>> > benefits when used in combination with the Cassandra TokenAware
>> >>>> > policy. But there are several concerns:
>> >>>> >
>> >>>> > 1) It looks like a rather rare case
>> >>>> > 2) It makes error handling more complex - you just don't know which
>> >>>> > operations in a batch succeeded and which failed, and you need to
>> >>>> > retry the whole batch
>> >>>> > 3) Retry logic could produce more load on the cluster - in case of
>> >>>> > individual updates you only need to retry the mutations which failed;
>> >>>> > in case of batches you need to retry the whole batch
>> >>>> > 4) *Unlogged batch is deprecated in Cassandra 3.0* (
>> >>>> > https://docs.datastax.com/en/cql/3.3/cql/cql_reference/batch_r.html),
>> >>>> > which we are currently using for the Ignite Cassandra module.
>> >>>> >
>> >>>> >
>> >>>> > Igor Rudyak
>> >>>> >
>> >>>> >
>> >>>> >
>> >>>> > On Tue, Jul 26, 2016 at 4:45 PM, Dmitriy Setrakyan <
>> >>>> dsetrakyan@apache.org>
>> >>>> > wrote:
>> >>>> >
>> >>>> > >
>> >>>> > >
>> >>>> > > On Tue, Jul 26, 2016 at 5:53 PM, Igor Rudyak <irudyak@gmail.com>
>> >>>> wrote:
>> >>>> > >
>> >>>> > >> Hi Valentin,
>> >>>> > >>
>> >>>> > >> For writeAll/readAll the Cassandra cache store implementation uses
>> >>>> > >> async operations (
>> >>>> > >> http://www.datastax.com/dev/blog/java-driver-async-queries)
>> >>>> > >> and futures, which have the best characteristics in terms of
>> >>>> > >> performance.
>> >>>> > >>
>> >>>> > >>
>> >>>> > > Thanks, Igor. This link describes the query operations, but I
>> >>>> > > could not find any mention of writes.
>> >>>> > >
>> >>>> > >
>> >>>> > >> The Cassandra BATCH statement is actually quite often an
>> >>>> > >> anti-pattern for those who come from the relational world. The
>> >>>> > >> BATCH statement concept in Cassandra is totally different from the
>> >>>> > >> relational world and is not for optimizing batch/bulk operations.
>> >>>> > >> The main purpose of Cassandra BATCH is to keep denormalized data
>> >>>> > >> in sync, for example when you are duplicating the same data into
>> >>>> > >> several tables. All other cases are not recommended for Cassandra
>> >>>> > >> batches:
>> >>>> > >>  - https://medium.com/@foundev/cassandra-batch-loading-without-the-batch-keyword-40f00e35e23e#.k4xfir8ij
>> >>>> > >>  - http://christopher-batey.blogspot.com/2015/02/cassandra-anti-pattern-misuse-of.html
>> >>>> > >>  - https://inoio.de/blog/2016/01/13/cassandra-to-batch-or-not-to-batch/
>> >>>> > >>
>> >>>> > >> It's also good to mention that in the CassandraCacheStore
>> >>>> > >> implementation (actually in CassandraSessionImpl) every operation
>> >>>> > >> with Cassandra is wrapped in a loop. The reason is that in case of
>> >>>> > >> failure up to 20 attempts will be performed to retry the operation,
>> >>>> > >> with incrementally increasing timeouts starting from 100 ms and
>> >>>> > >> specific exception handling logic (Cassandra host unavailability,
>> >>>> > >> etc.). Thus it provides a quite reliable persistence mechanism.
>> >>>> > >> According to load tests, even on a heavily overloaded Cassandra
>> >>>> > >> cluster (CPU load > 10 per core) there were no lost
>> >>>> > >> writes/reads/deletes, and at most 6 attempts were needed to
>> >>>> > >> perform one operation.
>> >>>> > >>
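The retry loop described above can be sketched as follows. The attempt count and base timeout follow the 20/100 ms numbers mentioned; the method itself is illustrative, not the actual CassandraSessionImpl code:

```java
import java.util.concurrent.Callable;

public class RetryLoop {
    // Retries an operation up to maxAttempts times, sleeping an
    // incrementally increasing timeout between failed attempts, as the
    // Cassandra store's session wrapper is described to do.
    public static <T> T execute(Callable<T> op, int maxAttempts, long baseSleepMs)
            throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts)
                    Thread.sleep(baseSleepMs * attempt); // 100 ms, 200 ms, ...
            }
        }
        throw last; // all attempts exhausted
    }
}
```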
>> >>>> > >
>> >>>> > > I think that the main point about Cassandra batch operations is
>> >>>> > > not about reliability, but about performance. If a user batches up
>> >>>> > > 100s of updates in one Cassandra batch, then it will be a lot
>> >>>> > > faster than doing them one by one in Ignite. Wrapping them into an
>> >>>> > > Ignite putAll(...) call just seems more logical to me, no?
>> >>>> > >
>> >>>> > >
>> >>>> > >>
>> >>>> > >> Igor Rudyak
>> >>>> > >>
>> >>>> > >> On Tue, Jul 26, 2016 at 1:58 PM, Valentin Kulichenko <
>> >>>> > >> valentin.kulichenko@gmail.com> wrote:
>> >>>> > >>
>> >>>> > >> > Hi Igor,
>> >>>> > >> >
>> >>>> > >> > I noticed that the current Cassandra store implementation
>> >>>> > >> > doesn't support batching for the writeAll and deleteAll
>> >>>> > >> > methods; it simply executes all updates one by one
>> >>>> > >> > (asynchronously in parallel).
>> >>>> > >> >
>> >>>> > >> > I think it can be useful to provide such support, and I created
>> >>>> > >> > a ticket [1]. Can you please give your input on this? Does it
>> >>>> > >> > make sense in your opinion?
>> >>>> > >> >
>> >>>> > >> > [1] https://issues.apache.org/jira/browse/IGNITE-3588
>> >>>> > >> >
>> >>>> > >> > -Val
>> >>>> > >> >
>> >>>> > >>
>> >>>> > >
>> >>>> > >
>> >>>> >
>> >>>>
>> >>>
>> >>>
>> >>
>> >
>>
>
>
