cassandra-user mailing list archives

From "Agrawal, Pratik" <paagr...@amazon.com>
Subject Re: Cassandra Collections performance issue
Date Wed, 24 Feb 2016 22:10:40 GMT
Hi Daemeon,

We tried changing the "we overwrite every value" behavior to update only one element in the
map, and we still saw the same performance degradation.

Thanks,
Pratik

From: daemeon reiydelle <daemeonr@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tuesday, February 9, 2016 at 11:39 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Cc: "Peddi, Praveen" <peddi@amazon.com>
Subject: Re: Cassandra Collections performance issue

I think the key to your problem might be around "we overwrite every value". You are creating
a large number of tombstones, forcing many reads to pull current results. You would do well
to rethink why you are having to overwrite values all the time under the same key. You
would be better off figuring out how to add values under a key and then age off the old
values. I would say that (at least at scale) you have a classic anti-pattern in play.
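
To make the distinction above concrete, here is a minimal sketch of the three ways a
non-frozen map column can be written. The table name "items", the key column "id", and the
map column "attrs" are hypothetical, and the %s placeholders assume a driver that uses that
style (the DataStax Python driver does for simple statements). Assigning the whole map makes
Cassandra write a tombstone covering the map's previous contents before inserting the new
entries; setting a single key or appending with "+" does not. Whether that accounts for the
slowdown reported in this thread is not established.

```python
# Hypothetical schema: items(id PRIMARY KEY, attrs map<text, text>).
TABLE = "items"
MAP_COLUMN = "attrs"


def overwrite_whole_map():
    """Replace the entire map. Cassandra first writes a tombstone that
    covers the old contents, then writes the new entries."""
    return f"UPDATE {TABLE} SET {MAP_COLUMN} = %s WHERE id = %s"


def update_single_key():
    """Set one map entry. Only that cell is written; no tombstone."""
    return f"UPDATE {TABLE} SET {MAP_COLUMN}[%s] = %s WHERE id = %s"


def merge_entries():
    """Add or overwrite several entries without replacing the whole map."""
    return f"UPDATE {TABLE} SET {MAP_COLUMN} = {MAP_COLUMN} + %s WHERE id = %s"


if __name__ == "__main__":
    for build in (overwrite_whole_map, update_single_key, merge_entries):
        print(build())
```

The statements are only built as strings here, so the sketch runs without a cluster; the
parameters would be supplied when the statement is executed.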


.......

Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872

On Mon, Feb 8, 2016 at 5:23 PM, Robert Coli <rcoli@eventbrite.com> wrote:
On Mon, Feb 8, 2016 at 2:10 PM, Agrawal, Pratik <paagrawa@amazon.com> wrote:
Recently we added one of the table fields as Map<text, text> in Cassandra 2.1.11.
Currently we read every field from the map and overwrite the map values. The map is of size 3. We saw
that writes are 30-40% slower while reads are 70-80% slower. Please find below some metrics
that can help.

My question is: are there any known issues with Cassandra map performance? As I understand
it, each CQL3 map entry maps to a column in Cassandra; under that assumption we are
just creating 3 columns, right? Any insight on this issue would be helpful.

I have previously heard reports along similar lines, but in the other direction.

eg - "I moved from a collection to a TEXT column with JSON in it, and my reads and writes
both became much faster!"
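
As an illustration of the alternative described in that report, here is a minimal sketch of
keeping the map in a single TEXT column as JSON. The helper names are hypothetical, and the
sketch covers only the encoding step, not the Cassandra read or write. The trade-off is that
changing one key means reading, modifying, and rewriting the whole value on the client side.

```python
import json


def encode_attrs(attrs):
    """Serialize a small dict to a compact JSON string for a TEXT column.
    Keys are sorted so that equal maps always produce the same string."""
    return json.dumps(attrs, sort_keys=True, separators=(",", ":"))


def decode_attrs(text):
    """Parse the stored JSON string back into a dict; empty or missing
    values are treated as an empty map."""
    return json.loads(text) if text else {}


if __name__ == "__main__":
    stored = encode_attrs({"b": "2", "a": "1"})
    print(stored)
    print(decode_attrs(stored))
```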

I'm not sure if the issue has been raised as an Apache Cassandra Jira, iow if it is a known
and expected limitation as opposed to just a performance issue.

If I were you, I would consider filing a repro case as a Jira ticket, and responding to this
thread with its URL. :D

=Rob


