cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-1546) (Yet another) approach to counting
Date Fri, 08 Oct 2010 08:41:33 GMT


Sylvain Lebresne commented on CASSANDRA-1546:

bq. I read through your note on the v3 marker strategy. It sounds reasonable and does address
the concerns that I raised, earlier.  I think it's worth it for you to highlight the potential
drawbacks. The three that seem to stick out the most are:

Sure, I've tried to highlight them along the way but it's good to sum them up.  And I mostly
agree with your list (that is, I agree with 2) and 3), not so much with 1), at least not with
what the wording suggests). Allow me a few comments:

bq. 1) it requires 1-2 reads in a synchronized code path, which doesn't gel w/ cassandra's
write-optimized design,

Yes, there is a read in a synchronized code path, but I strongly disagree with the last part
and overall I think this is by far the least important drawback of the list. This read in
a synchronized code path (always 1 btw) happens only when we try to repair a replayed update.
First and foremost, unless a write fails (timeout or disconnection from coordinator) and you
replay it, you will never exercise that code path. Again, unless you replay an update, there
is virtually no cost at all (as far as latency is concerned at least). If you replay an update,
then yes, you may have to pay that cost. But when I think of cassandra's write-optimized design,
I also include the fact that you won't lose updates no matter what. Given that the goal
of the marker strategy is to ensure exactly that, I would say that it actually fits cassandra's
write-optimized design very well.

Also, from the technical side, the synchronized is not a global lock. The current implementation
uses a small pool of locks to avoid too much contention while still avoiding allocating too
many objects. I'm pretty sure contention won't be a problem as is, but if we observe contention,
we can always slightly increase the size of this lock pool until contention disappears. The locks
are needed for correctness, but I'm pretty sure they won't be costly.
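
To make the lock pool idea concrete, here is a minimal sketch of a fixed-size pool where each key
is mapped to one of a few locks by its hash. This is not the code from the attached patches: the
class name LockPool, the method lockFor and the pool size of 16 are all assumptions made for
illustration only.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of a small lock pool (lock striping); not taken from the patch. */
final class LockPool {
    private final ReentrantLock[] locks;

    LockPool(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        locks = new ReentrantLock[size];
        for (int i = 0; i < size; i++) {
            locks[i] = new ReentrantLock();
        }
    }

    /** Equal keys always map to the same lock; unrelated keys may share one. */
    ReentrantLock lockFor(Object key) {
        // floorMod keeps the index non-negative even for negative hash codes.
        return locks[Math.floorMod(key.hashCode(), locks.length)];
    }

    int size() {
        return locks.length;
    }

    public static void main(String[] args) {
        LockPool pool = new LockPool(16); // assumed size, would be tuned if contention shows up
        ReentrantLock lock = pool.lockFor("some-row-key");
        lock.lock();
        try {
            // The read-and-repair of a replayed update would go here.
        } finally {
            lock.unlock();
        }
    }
}
```

With such a layout, a larger pool means fewer unrelated keys sharing a lock, at the price of a few
more allocated lock objects.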

bq. 2) over-counts are repaired in an eventually consistent manner, and

True, and that's a bit unfortunate. Sadly I really don't see how we can avoid this. I have
hopes however that we can optimize that eventuality to be as short as possible. And as illustrated
by the system test of the patch (which will never fail even in the absence of any delay between
the replay and the read), in some situations it's not even eventual.

bq. 3) UUIDs are only maintained for a configurable TTL.

Yes. For now this TTL is gc_grace_seconds but I think it should be another configurable time
for more flexibility. And actually, you can keep the uuids forever if you want. But I admit,
knowing the best TTL to choose could be a bit tricky and here too, I hope this can be optimized.
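
To illustrate the TTL trade-off, here is a rough sketch of a store that remembers update uuids for
a limited time only. It is not how the patch stores markers: the class MarkerStore, its methods and
the in-memory map are hypothetical, and time is passed in explicitly to keep the example
deterministic. Under these assumptions, a replay arriving within the TTL is recognized, while one
arriving after the TTL looks like a brand new update.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical, single-threaded sketch of uuid markers kept for a TTL; not from the patch. */
final class MarkerStore {
    private final long ttlMillis;
    private final Map<UUID, Long> seenAt = new HashMap<>();

    MarkerStore(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /**
     * Returns true if this uuid was recorded less than ttlMillis ago (a detectable replay).
     * Otherwise records it at nowMillis and returns false.
     */
    boolean recordAndCheckReplay(UUID id, long nowMillis) {
        Long previous = seenAt.get(id);
        boolean replay = previous != null && nowMillis - previous < ttlMillis;
        if (!replay) {
            seenAt.put(id, nowMillis);
        }
        return replay;
    }

    /** Drops every marker whose TTL has elapsed. */
    void purgeExpired(long nowMillis) {
        seenAt.values().removeIf(recordedAt -> nowMillis - recordedAt >= ttlMillis);
    }

    int size() {
        return seenAt.size();
    }
}
```

A longer TTL widens the window in which replays can be detected but keeps more markers around,
which is the tuning question raised above.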

bq. The above aren't showstoppers. However, anyone interested in using UUIDs to track updates
should be aware of their limitations + trade-offs.

Agreed and I'll be happy to write that documentation in time. And that doc will probably start
by explaining the drawbacks of not using uuids :)

> (Yet another) approach to counting
> ----------------------------------
>                 Key: CASSANDRA-1546
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 0.7.0
>         Attachments: 0001-Remove-IClock-from-internals.patch, 0001-v2-Remove-IClock-from-internals.patch,
>                      0001-v3-Remove-IClock-from-internals.txt, 0002-Counters.patch, 0002-v2-Counters.patch, 0002-v3-Counters.txt,
>                      0003-Generated-thrift-files-changes.patch, 0003-v2-Thrift-changes.patch, 0003-v3-Thrift-changes.txt,
> This could be described as a mix between CASSANDRA-1072 without clocks and CASSANDRA-1421.
> More details in the comment below.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
