cassandra-user mailing list archives

From Rahul Singh <rahul.xavier.si...@gmail.com>
Subject Re: [EXTERNAL] Re: Cassandra rate dropping over long term test
Date Tue, 07 Aug 2018 13:18:59 GMT
Do you have any warnings for large partitions?
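To check, you can scan system.log for the large-partition warning. A minimal sketch, assuming a 3.x-style log message format (the sample line below is hypothetical, standing in for /var/log/cassandra/system.log):

```python
import re

# Hypothetical sample standing in for /var/log/cassandra/system.log; the
# "Writing large partition" message format here is assumed from 3.x-era logs.
sample_log = """\
WARN  [CompactionExecutor:3] 2018-08-07 10:12:01,123 BigTableWriter.java:184 - Writing large partition demodb/topic_message:hot-topic (312.456MiB)
INFO  [MemtableFlushWriter:1] 2018-08-07 10:12:05,001 ColumnFamilyStore.java:918 - Completed flushing
"""

LARGE_PARTITION = re.compile(
    r"Writing large partition (\S+)/(\S+?):(\S+) \(([\d.]+)MiB\)")

def find_large_partitions(log_text):
    """Return (keyspace, table, partition_key, size_mib) for each warning line."""
    return [m.groups() for line in log_text.splitlines()
            if (m := LARGE_PARTITION.search(line))]

for ks, table, key, size in find_large_partitions(sample_log):
    print(f"{ks}.{table} partition {key}: {size} MiB")
```

On a real node you would point this at the actual system.log; the exact message wording varies by Cassandra version, so adjust the regex to match your logs.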

Rahul
On Aug 7, 2018, 5:37 AM -0400, Mihai Stanescu <mihai.stanescu@gmail.com>, wrote:
> There was probably a major compaction on node 6, hence the drop in tombstones, but somehow
> the node does not recover from the read IOPS.
>
>
>
>
>
> > On Tue, Aug 7, 2018 at 11:32 AM, Mihai Stanescu <mihai.stanescu@gmail.com> wrote:
> > > Hi,
> > >
> > > I collected more metrics see below.
> > >
> > > It seems the cluster is destabilizing, but I cannot clearly see why.
> > >
> > > One weird bit is that if I restart the client, everything is OK again. Could
> > > it be that there is some problem with the session, which is kept open for too
> > > long? The rate of requests hitting the server seems OK, and the statements
> > > seem to be executed fine.
> > >
> > >
> > > Here is a description of the table:
> > >
> > > CREATE TABLE demodb.topic_message (
> > >     topic_name text,
> > >     message_index int,
> > >     crated_at bigint,
> > >     message blob,
> > >     tag text,
> > >     type text,
> > >     uuid text,
> > >     PRIMARY KEY (topic_name, message_index)
> > > ) WITH CLUSTERING ORDER BY (message_index ASC)
> > >     AND bloom_filter_fp_chance = 0.01
> > >     AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> > >     AND comment = ''
> > >     AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
> > >     AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
> > >     AND crc_check_chance = 1.0
> > >     AND dclocal_read_repair_chance = 0.1
> > >     AND default_time_to_live = 0
> > >     AND gc_grace_seconds = 0
> > >     AND max_index_interval = 2048
> > >     AND memtable_flush_period_in_ms = 0
> > >     AND min_index_interval = 128
> > >     AND read_repair_chance = 0.0
> > >     AND speculative_retry = '99PERCENTILE';
> > >
> > >
> > > message_index is an incrementing sequence per topic_name.
> > >
> > >
> > > > I wonder if you are building up tombstones with the deletes
> > >
> > > There should be just one tombstone per entry, given our model. The access
> > > pattern is also somewhat time-localized, because the read/delete operations
> > > are done on recently inserted rows.
> > >
> > > You can see that the red line drops when the read IOPS start.
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > > Any warnings in your system.log for reading through too many tombstones?
> > >
> > > Nope
> > >
> > > One thing to notice is that after we restarted the client the cluster recovered.
> > >
> > >
> > > Regards,
> > > Mihai
> > >
> > >
> > > > On Fri, Aug 3, 2018 at 10:35 PM, Durity, Sean R <SEAN_R_DURITY@homedepot.com> wrote:
> > > > > I wonder if you are building up tombstones with the deletes. Can you
> > > > > share your data model? Are the deleted rows using the same partition key
> > > > > as new rows? Any warnings in your system.log for reading through too many
> > > > > tombstones?
> > > > >
> > > > >
> > > > > Sean Durity
> > > > >
> > > > > From: Mihai Stanescu <mihai.stanescu@gmail.com>
> > > > > Sent: Friday, August 03, 2018 12:03 PM
> > > > > To: user@cassandra.apache.org
> > > > > Subject: [EXTERNAL] Re: Cassandra rate dropping over long term test
> > > > >
> > > > > I looked at the compaction history on the affected node, both while it
> > > > > was affected and while it was not.
> > > > >
> > > > > The number of compactions is fairly similar in both windows, and so is
> > > > > the amount of work.
> > > > >
> > > > > Not affected time
> > > > > [root@cassandra7 ~]# nodetool compactionhistory | grep 02T22
> > > > > fda43ca0-9696-11e8-8efb-25b020ed0402 demodb topic_message 2018-08-02T22:59:47.946 433124864 339496194 {1:3200576, 2:2025936, 3:262919}
> > > > > 8a83e2c0-9696-11e8-8efb-25b020ed0402 demodb topic_message 2018-08-02T22:56:34.796 133610579 109321990 {1:1574352, 2:434814}
> > > > > 01811e20-9696-11e8-8efb-25b020ed0402 demodb topic_message 2018-08-02T22:52:44.930 132847372 108175388 {1:1577164, 2:432508}
> > > > >
> > > > > Experiencing more ioread
> > > > > [root@cassandra7 ~]# nodetool compactionhistory | grep 03T12
> > > > > 389aa220-970c-11e8-8efb-25b020ed0402 demodb topic_message 2018-08-03T12:58:57.986 470326446 349948622 {1:2590960, 2:2600102, 3:298369}
> > > > > 81fe6f10-970b-11e8-8efb-25b020ed0402 demodb topic_message 2018-08-03T12:53:51.617 143850880 115555226 {1:1686260, 2:453627}
> > > > > ce418e30-970a-11e8-8efb-25b020ed0402 demodb topic_message 2018-08-03T12:48:50.067 147035600 119201638 {1:1742318, 2:452226}
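As a side check, the bytes-in/bytes-out columns of `nodetool compactionhistory` can be reduced to a single compaction ratio per window, which makes the two pastes above easier to compare. A rough sketch, with the rows copied from the output above:

```python
# Rows copied from the compactionhistory output above: (timestamp, bytes_in, bytes_out).
not_affected = [
    ("2018-08-02T22:59:47", 433124864, 339496194),
    ("2018-08-02T22:56:34", 133610579, 109321990),
    ("2018-08-02T22:52:44", 132847372, 108175388),
]
affected = [
    ("2018-08-03T12:58:57", 470326446, 349948622),
    ("2018-08-03T12:53:51", 143850880, 115555226),
    ("2018-08-03T12:48:50", 147035600, 119201638),
]

def total_ratio(rows):
    """Total bytes written out per byte read in across the window."""
    bytes_in = sum(r[1] for r in rows)
    bytes_out = sum(r[2] for r in rows)
    return bytes_out / bytes_in

print(f"not affected: {total_ratio(not_affected):.3f}")
print(f"affected:     {total_ratio(affected):.3f}")
```

Both windows come out around 0.77-0.80, which supports the point that compaction work was similar in both periods.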
> > > > >
> > > > > During a read, the row should mostly be in a single sstable, since it was
> > > > > only inserted and then read, so it's strange.
> > > > >
> > > > > We have a partition key and then a clustering key.
> > > > >
> > > > > Rows that are written should be in kernel buffers, and the rows removed
> > > > > by deletes are never read again either, so the kernel page cache should
> > > > > hold only the most recent data.
> > > > >
> > > > > I remain puzzled
> > > > >
> > > > >
> > > > >
> > > > > On Fri, Aug 3, 2018 at 3:58 PM, Jeff Jirsa <jjirsa@gmail.com> wrote:
> > > > > > Probably Compaction
> > > > > >
> > > > > > Cassandra data files are immutable
> > > > > >
> > > > > > The write path first appends to a commitlog, then puts data into the
> > > > > > memtable. When the memtable hits a threshold, it's flushed to data
> > > > > > files on disk (let's call the first one "1", the second "2", and so on).
> > > > > >
> > > > > > Over time we build up multiple data files on disk - when Cassandra
> > > > > > reads, it will merge data in those files to give you the result you
> > > > > > expect, choosing the latest value for each column.
> > > > > >
> > > > > > But it's usually wasteful to keep lots of files around, and that
> > > > > > merging is expensive, so compaction combines those data files behind
> > > > > > the scenes in a background thread.
> > > > > >
> > > > > > By default they're combined when 4 or more files are approximately the
> > > > > > same size, so if your write rate is such that you fill and flush the
> > > > > > memtable every 5 minutes, compaction will likely happen at least every
> > > > > > 20 minutes (sometimes more). This is called size-tiered compaction;
> > > > > > there are 4 strategies, but size-tiered is the default and the easiest
> > > > > > to understand.
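The size-tiered trigger described above can be sketched roughly as follows. This is a simplification of the real strategy; the 0.5x-1.5x "approximately the same size" factor and the min_threshold of 4 follow STCS defaults:

```python
def stcs_buckets(sstable_sizes, bucket_low=0.5, bucket_high=1.5, min_threshold=4):
    """Group SSTables of roughly similar size into buckets, then return the
    buckets eligible for compaction. Simplified sketch of STCS selection."""
    buckets = []  # each bucket: (average_size, [member sizes])
    for size in sorted(sstable_sizes):
        for i, (avg, members) in enumerate(buckets):
            if bucket_low * avg <= size <= bucket_high * avg:
                members.append(size)
                buckets[i] = (sum(members) / len(members), members)
                break
        else:
            buckets.append((size, [size]))
    return [members for _, members in buckets if len(members) >= min_threshold]

# Four ~100 MB flushes plus one much larger file: only the four similar-sized
# files form a compaction candidate; the 900 MB file sits in its own bucket.
print(stcs_buckets([100, 104, 98, 101, 900]))
```

So with a steady flush every 5 minutes, four similar-sized files accumulate in about 20 minutes and become a compaction candidate, matching the cadence described above.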
> > > > > >
> > > > > > You're seeing mostly writes because the reads are likely served from
> > > > > > page cache (the kernel doesn't need to go to disk to read the files;
> > > > > > it has them in memory for serving normal reads).
> > > > > >
> > > > > > --
> > > > > > Jeff Jirsa
> > > > > >
> > > > > >
> > > > > > > On Aug 3, 2018, at 12:30 AM, Mihai Stanescu <mihai.stanescu@gmail.com> wrote:
> > > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > I am perf-testing Cassandra over a long run in a cluster of 8 nodes,
> > > > > > > and I noticed that the rate of service drops. Most of the nodes have
> > > > > > > CPU between 40-65%; however, one of the nodes has a higher CPU and
> > > > > > > also started performing a lot of read IOPS, as seen in the image
> > > > > > > (green is read IOPS).
> > > > > > >
> > > > > > > My test has a mixed read/write scenario:
> > > > > > > 1. insert a row
> > > > > > > 2. after 60 seconds, read the row
> > > > > > > 3. delete the row
> > > > > > >
> > > > > > > The rate of inserts is bigger than the rate of deletes, so some
> > > > > > > deletes will not happen.
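A toy model of that workload (an in-memory stand-in for the real client, with made-up rates) shows how an insert rate above the delete rate makes the live row count, and eventually partition sizes, grow steadily:

```python
# Toy stand-in for the test client: inserts outpace deletes, so live rows
# (and the partitions holding them) accumulate. Rates below are hypothetical.
INSERT_RATE = 100   # rows/sec inserted (assumption, not the real test rate)
DELETE_RATE = 90    # rows/sec deleted  (assumption, not the real test rate)
SECONDS = 3600      # one hour of simulated test time

live_rows = 0
for _ in range(SECONDS):
    live_rows += INSERT_RATE
    live_rows -= min(DELETE_RATE, live_rows)  # can't delete more than exists

print(f"live rows after {SECONDS}s: {live_rows}")
```

Since message_index grows within each topic_name partition, a net surplus of inserts like this would make the surviving partitions wider over time, which ties back to the large-partition question above.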
> > > > > > >
> > > > > > > I have checked the client and it does not accumulate RAM; GC is a
> > > > > > > straight line, so I don't understand what's going on.
> > > > > > >
> > > > > > > Any hints?
> > > > > > >
> > > > > > > Regards,
> > > > > > > Mihai
> > > > > > >
> > > > > > > <image.png>
> > > > > > >
> > > > > > >
> > > > > >
> > > > > > ---------------------------------------------------------------------
> > > > > > To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
> > > > > > For additional commands, e-mail: user-help@cassandra.apache.org
> > > > >
> > > > >
> > > > >
> > >
>
