cassandra-user mailing list archives

From Rajesh Radhakrishnan <Rajesh.Radhakrish...@phe.gov.uk>
Subject RE: Are Cassandra writes are faster than reads?
Date Tue, 08 Nov 2016 10:20:31 GMT

Hi,

I just found that reducing the batch size to below 20 also increases write speed and reduces
memory usage (especially with the Python driver).
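
For anyone wanting to try this, here is a minimal sketch of splitting rows into small batches. The helper names (chunked, write_in_batches, send_batch) and the default size of 10 are my own placeholders, not anything from the driver, and the right size will depend on your workload.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Sequence

Row = Sequence  # one row's bind values, e.g. (id, name, value)


def chunked(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield consecutive lists of at most `size` rows."""
    if size < 1:
        raise ValueError("size must be >= 1")
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def write_in_batches(
    rows: Iterable[Row],
    send_batch: Callable[[List[Row]], None],
    batch_size: int = 10,  # hypothetical default, kept below 20
) -> int:
    """Send rows in small batches; return the number of batches sent."""
    sent = 0
    for chunk in chunked(rows, batch_size):
        send_batch(chunk)  # hypothetical callback that performs the write
        sent += 1
    return sent
```

With the DataStax Python driver, send_batch could build a cassandra.query.BatchStatement, call batch.add(prepared, row) for each row in the chunk, and then session.execute(batch). Because rows are consumed lazily, only one chunk is held in memory at a time.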

Kind regards,
Rajesh R

________________________________
From: Ben Bromhead [ben@instaclustr.com]
Sent: 07 November 2016 05:44
To: user@cassandra.apache.org
Subject: Re: Are Cassandra writes are faster than reads?

They can be and it depends on your compaction strategy :)

On Sun, 6 Nov 2016 at 21:24 Ali Akhtar <ali.rac200@gmail.com> wrote:
tl;dr? I just want to know if updates are bad for performance, and if so, for how long.

On Mon, Nov 7, 2016 at 10:23 AM, Ben Bromhead <ben@instaclustr.com> wrote:
Check out https://wiki.apache.org/cassandra/WritePathForUsers for the full gory details.

On Sun, 6 Nov 2016 at 21:09 Ali Akhtar <ali.rac200@gmail.com> wrote:
How long does it take for updates to get merged / compacted into the main data file?

On Mon, Nov 7, 2016 at 5:31 AM, Ben Bromhead <ben@instaclustr.com> wrote:
To add some flavor as to why the commitlog implementation is so quick:

By default it only syncs to disk every 10 seconds. So writes effectively go to memory first and
are persisted to disk asynchronously later on. This is generally accepted to be OK, as the write
is also going to other nodes.

You can of course change this behavior to flush on each write or to skip the commitlog altogether
(danger!). This, however, changes how "safe" things are from a durability perspective.
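
To make the trade-off concrete, here is a toy sketch of the two sync modes. It is not Cassandra's code: ToyCommitLog and its parameters are invented for illustration, and the periodic check runs inline on append, whereas a real implementation would sync from a background thread. In the cassandra.yaml of this era the corresponding knobs are commitlog_sync (periodic or batch) and commitlog_sync_period_in_ms, and skipping the commitlog is the per-keyspace durable_writes = false option.

```python
import os
import time


class ToyCommitLog:
    """Append-only log that fsyncs on every write or at most once per period."""

    def __init__(self, path, sync_period_s=10.0, sync_every_write=False):
        self._f = open(path, "ab")
        self.sync_period_s = sync_period_s        # assumed 10 s, as in the thread
        self.sync_every_write = sync_every_write  # the "flush on each write" mode
        self._last_sync = time.monotonic()
        self.fsync_count = 0

    def append(self, record: bytes) -> None:
        # Sequential append; the data lands in OS buffers, not yet on disk.
        self._f.write(record + b"\n")
        due = time.monotonic() - self._last_sync >= self.sync_period_s
        if self.sync_every_write or due:
            self.sync()

    def sync(self) -> None:
        # Force buffered data to stable storage; this is the slow part.
        self._f.flush()
        os.fsync(self._f.fileno())
        self._last_sync = time.monotonic()
        self.fsync_count += 1

    def close(self) -> None:
        self.sync()
        self._f.close()
```

In periodic mode a burst of writes pays for almost no fsync calls, which is where the speed comes from; anything appended since the last sync is lost if the process and the OS buffers both go away, which is the durability cost.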

On Sun, Nov 6, 2016, 12:51 Jeff Jirsa <jeff.jirsa@crowdstrike.com> wrote:

Cassandra writes are particularly fast, for a few reasons:



1) Most writes go to a commitlog (an append-only file, written linearly, so particularly
fast in terms of disk operations) and are then pushed to the memtable. The memtable is flushed
in batches to the permanent data files, so it buffers many mutations and then does a sequential
write to persist that data to disk.

2) Reads may have to merge data from many data files (SSTables) on disk. Because the writes
(described very briefly in step 1) go to immutable files, updates/deletes have to be merged
on read, which is extra effort for the read path.
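
The two points above can be sketched as a toy model. Everything here (ToyTable, flush_threshold, the logical clock) is invented for illustration and is far simpler than the real storage engine, which also uses bloom filters, indexes and compaction to cut down how many files a read touches.

```python
class ToyTable:
    """Toy log-structured table: cheap appends on write, merge on read."""

    def __init__(self, flush_threshold=3):
        self.commitlog = []   # append-only list standing in for the commitlog file
        self.memtable = {}    # key -> (timestamp, value), newest write per key
        self.sstables = []    # flushed, never-modified dicts, oldest first
        self.flush_threshold = flush_threshold
        self._clock = 0       # logical write timestamp

    def write(self, key, value):
        # Step 1: append to the commitlog, then update the in-memory table.
        self._clock += 1
        self.commitlog.append((self._clock, key, value))
        self.memtable[key] = (self._clock, value)
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def delete(self, key):
        # A delete is just another write: a tombstone (None) that shadows old data.
        self.write(key, None)

    def flush(self):
        # Persist the whole memtable in one sorted, sequential pass.
        if self.memtable:
            self.sstables.append(dict(sorted(self.memtable.items())))
            self.memtable = {}
            self.commitlog.clear()  # flushed data no longer needs replay

    def read(self, key):
        # Step 2: consult every source and keep the newest version.
        sources = [self.memtable] + self.sstables
        candidates = [src[key] for src in sources if key in src]
        if not candidates:
            return None, len(sources)
        return max(candidates)[1], len(sources)
```

A write touches one list and one dict regardless of history, while read returns the value together with the number of sources it had to check, which grows with every flush until something merges the files.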



If you don’t do much in terms of overwrites/deletes, and your partitions are particularly
small, and your data fits in RAM (probably mmap/page cache of data files, unless you’re
using the row cache), reads may be very fast for you. Certainly individual reads on low-merge
workloads can be < 0.1ms.



- Jeff



From: Vikas Jaiman <er.vikasjaiman@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Sunday, November 6, 2016 at 12:42 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Are Cassandra writes are faster than reads?



Hi all,



Are Cassandra writes faster than reads? If yes, why is this so? I am using consistency
level ONE and the data is in memory.



Vikas

--
Ben Bromhead
CTO | Instaclustr
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer

