cassandra-user mailing list archives

From Alexander Dejanovski <a...@thelastpickle.com>
Subject Re: Can a Select Count(*) Affect Writes in Cassandra?
Date Thu, 10 Nov 2016 14:41:58 GMT
Shalom,

You may have a high trace probability set, which could explain what you're
observing:
https://docs.datastax.com/en/cassandra/2.0/cassandra/tools/toolsSetTraceProbability.html

On Thu, Nov 10, 2016 at 3:37 PM Chris Lohfink <clohfink85@gmail.com> wrote:

> count(*) actually pages through all the data. So a select count(*) without
> a limit would be expected to cause a lot of load on the system. The hit is
> more than just IO load and CPU, it also creates a lot of garbage that can
> cause pauses slowing down the entire JVM. Some details here:
> http://www.datastax.com/dev/blog/counting-keys-in-cassandra
> <http://planetcassandra.org/blog/counting-key-in-cassandra/>
>
> You may want to consider maintaining the count yourself, using Spark, or
> if you just want a ball park number you can grab it from JMX.
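>
To illustrate the trade-off described above, here is a minimal in-memory sketch in Python. It does not use a Cassandra driver; `fetch_pages`, `count_by_paging`, `MaintainedCount`, and the page size of 5,000 are hypothetical names and values chosen only for illustration. It contrasts counting by reading every page with keeping a running count that the application updates on each insert or delete.

```python
from typing import Iterator, List, Tuple


def fetch_pages(rows: List[int], page_size: int) -> Iterator[List[int]]:
    """Yield rows in fixed-size pages, mimicking a paged read (hypothetical helper)."""
    for start in range(0, len(rows), page_size):
        yield rows[start:start + page_size]


def count_by_paging(rows: List[int], page_size: int) -> Tuple[int, int]:
    """Count rows by reading every page; returns (row_count, pages_read)."""
    total = 0
    pages = 0
    for page in fetch_pages(rows, page_size):
        total += len(page)  # every row has to be materialized to be counted
        pages += 1
    return total, pages


class MaintainedCount:
    """Application-side running count, updated on each insert and delete."""

    def __init__(self) -> None:
        self.value = 0

    def on_insert(self) -> None:
        self.value += 1

    def on_delete(self) -> None:
        self.value -= 1


# Simulated table of 25,000 rows (an assumption for the example).
rows = list(range(25_000))

counter = MaintainedCount()
for _ in rows:
    counter.on_insert()

paged_count, pages_read = count_by_paging(rows, page_size=5_000)
```

In this sketch the paged count has to read 5 pages to report 25,000 rows, while the maintained counter already holds the same value without touching the data again.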
>
> > Cassandra writes (mutations) are INSERTs, UPDATEs or DELETEs, it
> > actually has nothing to do with flushes. A flush is the operation of moving
> > data from memory (memtable) to disk (SSTable).
>
> FWIW, in 2.0 that's not completely accurate. Before 2.1, the process of
> memtable flushing acquired a switchlock that blocks mutations during the
> flush (the "pending task" metric is the measure of how many mutations are
> blocked by this lock).
>
> Chris
>
> On Thu, Nov 10, 2016 at 8:10 AM, Shalom Sagges <shaloms@liveperson.com>
> wrote:
>
> Hi Alexander,
>
> I'm referring to Writes Count generated from JMX:
> [image: Inline image 1]
>
> The higher curve shows the total write count per second for all nodes in
> the cluster and the lower curve is the average write count per second per
> node.
> The drop at the end is the result of shutting down one application node
> that performed this kind of query (we still haven't removed the query
> itself in this cluster).
>
>
> On a different cluster, where we have already removed the "select count(*)"
> query completely, we can see that the issue was resolved (we also verified
> this by running nodetool cfstats a few times and checking the write count
> difference):
> [image: Inline image 2]
>
>
> Naturally, I asked how a select query can affect the write count of a node,
> but weird as it seems, the issue was resolved once the query was removed
> from the code.
>
> Another side note: one of our developers, who wrote the query in the
> code, thought it would be nice to limit the query results to 560,000,000.
> Perhaps the ridiculously high limit might have caused this?
>
> Thanks!
>
>
>
> Shalom Sagges
> DBA
> T: +972-74-700-4035
> <http://www.linkedin.com/company/164748> <http://twitter.com/liveperson>
> <http://www.facebook.com/LivePersonInc> We Create Meaningful Connections
>
> <https://engage.liveperson.com/idc-mobile-first-consumer/?utm_medium=email&utm_source=mkto&utm_campaign=idcsig>
>
>
> On Thu, Nov 10, 2016 at 3:21 PM, Alexander Dejanovski <
> alex@thelastpickle.com> wrote:
>
> Hi Shalom,
>
> Cassandra writes (mutations) are INSERTs, UPDATEs or DELETEs, it actually
> has nothing to do with flushes. A flush is the operation of moving data
> from memory (memtable) to disk (SSTable).
>
> The Cassandra write path and read path are two different things and, as
> far as I know, I see no way for a select count(*) to increase your write
> count (if you are indeed talking about actual Cassandra writes, and not I/O
> operations).
>
> Cheers,
>
> On Thu, Nov 10, 2016 at 1:21 PM Shalom Sagges <shaloms@liveperson.com>
> wrote:
>
> Yes, I know it's obsolete, but unfortunately this takes time.
> We're in the process of upgrading to 2.2.8 and 3.0.9 in our clusters.
>
> Thanks!
>
>
>
> Shalom Sagges
>
>
> On Thu, Nov 10, 2016 at 1:31 PM, Vladimir Yudovin <vladyu@winguzone.com>
> wrote:
>
> As I said, I'm not sure about it, but it would be interesting to check the
> heap memory state with any JMX tool, e.g.
> https://github.com/patric-r/jvmtop
>
> By the way, why Cassandra 2.0.14? It's quite an old and unsupported version.
> Even in the 2.0 branch there is 2.0.17 available.
>
> Best regards, Vladimir Yudovin,
>
> *Winguzone <https://winguzone.com?from=list> - Hosted Cloud
> Cassandra. Launch your cluster in minutes.*
>
>
> ---- On Thu, 10 Nov 2016 05:47:37 -0500 Shalom Sagges
> <shaloms@liveperson.com> wrote ----
>
> Thanks for the quick reply, Vladimir.
> Is it really possible that ~12,500 writes per second (per node in a 12-node
> DC) are caused by memory flushes?
>
>
>
>
>
>
> Shalom Sagges
>
>
>
> On Thu, Nov 10, 2016 at 11:02 AM, Vladimir Yudovin <vladyu@winguzone.com>
> wrote:
>
>
>
> This message may contain confidential and/or privileged information.
> If you are not the addressee or authorized to receive this on behalf of
> the addressee you must not use, copy, disclose or take action based on this
> message or any information herein.
> If you have received this message in error, please advise the sender
> immediately by reply email and delete this message. Thank you.
>
>
> Hi Shalom,
>
> I'm not sure, but excessive memory consumption by this SELECT probably
> causes C* to flush tables to free memory.
>
> Best regards, Vladimir Yudovin,
>
>
>
> ---- On Thu, 10 Nov 2016 03:36:59 -0500 Shalom Sagges
> <shaloms@liveperson.com> wrote ----
>
> Hi There!
>
> I'm using C* 2.0.14.
> I experienced a scenario where a "select count(*)" that ran every minute
> on a table with practically no results limit (yes, this should definitely
> be avoided) caused a huge increase in Cassandra writes, to around 150
> thousand writes per second for that particular table.
>
> Can anyone explain this behavior? Why would a Select query significantly
> increase write count in Cassandra?
>
> Thanks!
>
>
> Shalom Sagges
>
>
> --
> -----------------
> Alexander Dejanovski
> France
> @alexanderdeja
>
> Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
>
> --
-----------------
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
