cassandra-commits mailing list archives

From "Pawel Matras (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10792) Make warning level for count of partitions in unlogged BatchStatement configurable.
Date Wed, 02 Dec 2015 08:33:10 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15035464#comment-15035464
] 

Pawel Matras commented on CASSANDRA-10792:
------------------------------------------

Sorry for the confusion. Perhaps it comes from my overly long explanation, or perhaps from the
fact that I want to cast my vote for unlogged batches, since they do perform better in some
deployment scenarios (as I tried to explain above). Actually, I am a little afraid that they are
on their way out of Cassandra, as you recommend against them and warn users that they should
do something else instead.

{quote}
It is my understanding that while we recommend against it, submitting batches for multiple
partitions is currently possible. Is this not the case?
{quote}
Yes, it is possible. Creating unlogged batches for multiple partitions works. Unfortunately,
in newer versions of Cassandra they produce lots of warnings in the server logs, and it is
quite hard to turn these warnings off. So my improvement proposal is to introduce a configuration
parameter in cassandra.yaml, e.g.
{noformat}
unlogged_batch_partitions_warn_threshold: 1
{noformat}
similar to the existing {{batch_size_warn_threshold_in_kb}}. This new parameter would then be used
in BatchStatement.verifyBatchType instead of today's hardcoded 1.
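
As an illustration of the proposal, here is a minimal sketch of a threshold-based check. It is not Cassandra source code: the class {{UnloggedBatchWarnSketch}}, the method {{shouldWarn}} and the plain-string partition keys are hypothetical, and it assumes the warning fires when the number of distinct partitions exceeds the configured value.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of a configurable partition-count warning; not Cassandra code. */
final class UnloggedBatchWarnSketch {

    /**
     * Returns true when the batch touches more distinct partitions than the threshold.
     * The threshold parameter stands in for the proposed
     * unlogged_batch_partitions_warn_threshold setting (assumed to be read from cassandra.yaml).
     */
    static boolean shouldWarn(Collection<String> partitionKeys, int warnThreshold) {
        // Several mutations for the same partition count only once.
        Set<String> distinctPartitions = new HashSet<>(partitionKeys);
        return distinctPartitions.size() > warnThreshold;
    }

    public static void main(String[] args) {
        Collection<String> keys = Arrays.asList("a:1", "a:1", "b:2");
        // Two distinct partitions: above a threshold of 1, below a threshold of 10.
        System.out.println(shouldWarn(keys, 1));
        System.out.println(shouldWarn(keys, 10));
    }
}
```

With a threshold of 1 the sketch behaves like the hardcoded check described above; a larger value would silence the warning for deployments that deliberately use multi-partition unlogged batches.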

{quote}
What is the CQL syntax of the statement you are trying, and the schema of the table you are
using?
{quote}
I am using the DataStax Java driver. The batch statement itself is created programmatically with
{{new BatchStatement( BatchStatement.Type.UNLOGGED )}}. Inserts are then added to it by binding
a PreparedStatement created from CQL. The batch is executed asynchronously. The table is similar to
(sorry, I am not allowed to provide details):
{noformat}
create table MY_TABLE (
   C1 varchar,
   C2 varchar,
   C3 varchar,
   C4 blob,
   primary key( (C1, C2), C3 )
);
{noformat}
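
For the table above, the partition key is the pair (C1, C2), while C3 is a clustering column. The following sketch shows how the number of partitions covered by a batch could be counted on the client side. It is plain Java without the driver; the {{Row}} and {{PartitionCountSketch}} names are hypothetical and exist only for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: counts distinct partitions for rows shaped like MY_TABLE. */
final class PartitionCountSketch {

    /** Simplified stand-in for one insert into MY_TABLE (the blob column C4 is omitted). */
    static final class Row {
        final String c1;
        final String c2;
        final String c3;

        Row(String c1, String c2, String c3) {
            this.c1 = c1;
            this.c2 = c2;
            this.c3 = c3;
        }
    }

    /** Rows with equal (C1, C2) belong to the same partition; C3 only orders rows within it. */
    static int countPartitions(List<Row> rows) {
        Set<List<String>> partitionKeys = new HashSet<>();
        for (Row row : rows) {
            partitionKeys.add(Arrays.asList(row.c1, row.c2));
        }
        return partitionKeys.size();
    }

    public static void main(String[] args) {
        List<Row> batch = Arrays.asList(
                new Row("a", "x", "1"),
                new Row("a", "x", "2"),   // same partition as the first row
                new Row("a", "y", "1"));  // different C2, so a different partition
        System.out.println(countPartitions(batch));
    }
}
```

A batch built from these three rows covers two partitions, so it would be reported by the "Unlogged batch covering NN partitions" warning under the current hardcoded limit.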

In hope of clarification, best regards

Paweł

> Make warning level for count of partitions in unlogged BatchStatement configurable.
> -----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-10792
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10792
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Coordination
>         Environment: CentOS,  JDK 1.7, cluster of servers with 32 CPU each.
>            Reporter: Pawel Matras
>            Priority: Minor
>             Fix For: 3.2
>
>
> Currently, mutations for only one partition are allowed in a BatchStatement before a warning
is logged. This number is hard-coded and not configurable, unlike e.g. the batch size.
> The general suggestion seems to be to consider multi-partition unlogged batch statements
a distributed anti-pattern. As a cure, async inserts are proposed. This proposal might be OK if one
considers only the Cassandra side of the system.
> If a complete system has to share the same hardware, things do not look so obvious
any more. When Cassandra shares the same hardware with other, also clustered, components, lots
of small async inserts become another well-known distributed anti-pattern.
> In our case, changing from unlogged batches to async inserts destabilizes the system, as
async inserts require numbers of interrupts per second and context switches per second beyond
what the hardware can handle. With replica-aware unlogged batches these parameters are at 40%
of hardware limits, and the system as a whole runs stably without visible hotspots (Cassandra
metrics look almost the same on all nodes).
> Unfortunately, the latest versions of Cassandra (we switched from 2.0.6 to 2.2.1) log lots
of warnings of the type "Unlogged batch covering NN partitions detected against table XXX...".
I found two workarounds to avoid this warning. One is to use logged batches, but they
generate 20% more interrupts and context switches and 400% more network load. The other
is more of a hack: it uses the filtering features of the logging framework to suppress the warning
just before it gets logged.
> So my suggestion is to allow users of Cassandra to decide by configuration whether they consider
unlogged batches an anti-pattern or not. This is already partially done for the size of the batch,
where the size is configurable and not hardcoded to 5 kB.
> It would be good to stay consistent on this and let the user configure for how many
partitions mutations are allowed.
> Of course there are several other possible solutions here, probably more costly. E.g.
the warning could be produced only if the batch is not token/replica aware.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
