cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-1546) (Yet another) approach to counting
Date Thu, 07 Oct 2010 13:05:33 GMT


Sylvain Lebresne commented on CASSANDRA-1546:

bq. The complexity of the context-based logic is to be space efficient. In practice, cassandra
nodes are typically I/O bound, not CPU bound.

True in the current patches. But provided you add timestamps back to the contexts for decrements,
and provided my math doesn't suck too much, the disk overhead of the 1546 logic is 3 bytes
* (#replicas-1). That is because the overhead of a column is 1 byte for the 'flags' (deleted,
expiring) and 2 bytes for each value to record its length. That's 3 bytes out of 20 you have
to record for each replica (name (the host ip) + value + timestamp). As it turns out, since we
know the size of the value for counter columns, it's super easy to optimize out 2 of those
3 bytes (I'll be happy to add it). To be fair, there is also the overhead of using super columns,
but overall I'm not totally convinced by this argument.
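
The estimate above can be checked with a few lines of arithmetic. This is only a sketch: the split of the 20 bytes into a 4-byte IPv4 name, an 8-byte value and an 8-byte timestamp is an assumption (the comment only gives the total), and the function names are made up for illustration.

```python
# Per-column serialization overhead, as described in the comment above.
FLAGS_BYTES = 1          # 'flags' byte (deleted, expiring)
VALUE_LENGTH_BYTES = 2   # length prefix recorded for each value

# Assumed payload sizes (not stated in the comment): 4 + 8 + 8 = 20 bytes.
NAME_BYTES = 4           # host IP used as the column name, assuming IPv4
VALUE_BYTES = 8          # counter value, assuming a 64-bit count
TIMESTAMP_BYTES = 8      # timestamp, assuming 64 bits


def payload_bytes_per_replica():
    """Bytes that must be recorded for each replica in either approach."""
    return NAME_BYTES + VALUE_BYTES + TIMESTAMP_BYTES


def overhead_bytes(replicas, optimize_value_length=False):
    """Extra bytes for 1546, using the 3 * (#replicas - 1) estimate.

    With optimize_value_length=True the 2-byte length prefix is dropped,
    since the size of a counter value is known in advance.
    """
    per_column = FLAGS_BYTES + (0 if optimize_value_length else VALUE_LENGTH_BYTES)
    return per_column * (replicas - 1)


if __name__ == "__main__":
    for replicas in (1, 3, 5):
        print(replicas, overhead_bytes(replicas), overhead_bytes(replicas, True))
```

With 3 replicas this gives 6 extra bytes, or 2 once the value length is optimized out.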

I'd like to add that splitting the context into multiple columns, as 1546 does, offers some
optimisation opportunities. After the write on the leader, we read the value in order to
replicate it to the other nodes. In 1546, we only read the value for the leader's part of the
counter (since this is what has been updated). This will save I/O and network bandwidth. I'm
not saying this is crucial, just that it is not so clear to me that the context-based logic
is intrinsically an I/O saver.
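
To make the leader-only read concrete, here is a toy model of a counter split into one part per replica. It is not code from either patch: the class, its method names and the "highest timestamp wins" merge rule are all assumptions made for illustration.

```python
class SplitCounter:
    """Toy counter whose state is split into one part per replica (hypothetical)."""

    def __init__(self):
        # replica id (e.g. host IP) -> (count, timestamp)
        self.parts = {}

    def add_on_leader(self, leader, delta, timestamp):
        """Apply an increment or decrement to the leader's own part only."""
        count, _ = self.parts.get(leader, (0, 0))
        self.parts[leader] = (count + delta, timestamp)

    def read_for_replication(self, leader):
        """Read only the part that changed, i.e. the leader's."""
        return {leader: self.parts[leader]}

    def merge(self, remote_parts):
        """Assumed reconciliation rule: keep the part with the newer timestamp."""
        for replica, (count, timestamp) in remote_parts.items():
            local = self.parts.get(replica)
            if local is None or timestamp > local[1]:
                self.parts[replica] = (count, timestamp)

    def total(self):
        """The counter value is the sum of all per-replica parts."""
        return sum(count for count, _ in self.parts.values())


if __name__ == "__main__":
    leader = SplitCounter()
    leader.merge({"10.0.0.2": (7, 1)})       # part previously received from another node
    leader.add_on_leader("10.0.0.1", 5, 1)   # write handled by the leader
    print(leader.read_for_replication("10.0.0.1"))
    print(leader.total())
```

The payload sent to the other nodes contains a single part, however many replicas have contributed to the counter.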

bq. So, I'm not confident in the statement that #1546 is clearly faster than #1072. As a rule
of thumb, it's better to directly manage your memory usage, as opposed to relying on the runtime's
I was merely talking about the cleanContext() logic. But I'll admit that saying this part
is faster in 1546 doesn't really matter much; that was a stupid argument. It remains that
I don't like this cleanContext logic. I find it fragile (as in, hard to maintain) and not
very clean, in that it relies on nodes cleaning up the columns before sending them over to
other nodes. I wouldn't say that this cleanContext logic is a killer for the context-based
approach, but I don't like it.

As for the creation of objects, you may be right. But I'm not even sure. The byte array manipulations
of 1072 do create a bunch of temporary byte arrays that have to be garbage collected. So like
you, I'm not very confident in any statement about whether the context-based logic is
faster or slower than the counter-as-supercolumns one of #1546.

bq. #1546 will need to special case AES-related streaming, as well

That is true. Which reminds me of a question for you, Kelvin. The changes for AES repair in
#1072 are fairly extensive and I kind of wonder why. I expect the change to fix streaming
in #1546 to be a few lines: that is, when you rebuild the sstable after streaming, you
deserialize and re-serialize the rows instead of just copying the bytes directly. I don't
see a reason to differentiate on the reason for streaming, for instance.

> (Yet another) approach to counting
> ----------------------------------
>                 Key: CASSANDRA-1546
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 0.7.0
>         Attachments: 0001-Remove-IClock-from-internals.patch, 0001-v2-Remove-IClock-from-internals.patch,
> 0001-v3-Remove-IClock-from-internals.txt, 0002-Counters.patch, 0002-v2-Counters.patch, 0002-v3-Counters.txt,
> 0003-Generated-thrift-files-changes.patch, 0003-v2-Thrift-changes.patch, 0003-v3-Thrift-changes.txt,
> This could be described as a mix between CASSANDRA-1072 without clocks and CASSANDRA-1421.
> More details in the comment below.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
