cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6412) Custom creation and merge functions for user-defined column types
Date Fri, 06 Dec 2013 10:17:37 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841149#comment-13841149 ]

Sylvain Lebresne commented on CASSANDRA-6412:
---------------------------------------------

bq. As you point out, we can use the same technique as for counter deletion

To be clear, counters don't solve that deletion problem. But counters are only allowed in
counter tables, so at least we can sum it up as "deletes on counter tables don't work", which
is slightly simpler for users than if counters could be mixed with normal cells. Is this
something we want to consider here?

Let me be clear here: I do not want us to ignore this issue. I'm -1 on committing anything
related to this until we've collectively *decided* on a precise answer to this problem. So
be it if the official answer is that we'll just document the problem and otherwise consider
it the user's problem if they don't read the doc and get bitten by this in production.
But if so, we should clearly acknowledge that this is what we're signing up for.

I'll note that what bothers me is not so much the fact that deletes would be broken. It's
the fact that I don't see a good way to prevent users from shooting themselves in the foot.

bq. only deleting at CL.ALL would prevent old values from being merged with newer ones

Not without either 1) a read-before-write or 2) changes to the storage engine. The storage
engine currently doesn't guarantee that cells will always be resolved in the order they are
received by the node, mainly because compaction doesn't guarantee it. It would also be pretty
hard to guarantee: tracking which sstables can be compacted together, and in which order,
would be complicated by repair.
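
To illustrate the ordering problem, here is a minimal sketch in Java. It is a toy model and
not the storage engine's code: the MergeOrderDemo class, its Cell type and the reconcile
rule (live values merge with MAX, a tombstone only shadows cells with an older timestamp)
are all assumptions made for the example. The same three writes give a different result
depending on which pair is resolved first.

```java
// Hypothetical model of a MAX-merged cell; not Cassandra's actual storage engine code.
final class MergeOrderDemo {

    /** A cell is either a live value or a tombstone, each carrying a write timestamp. */
    static final class Cell {
        final boolean tombstone;
        final long value;      // ignored for tombstones
        final long timestamp;

        private Cell(boolean tombstone, long value, long timestamp) {
            this.tombstone = tombstone;
            this.value = value;
            this.timestamp = timestamp;
        }

        static Cell value(long value, long timestamp) {
            return new Cell(false, value, timestamp);
        }

        static Cell tombstone(long timestamp) {
            return new Cell(true, 0L, timestamp);
        }
    }

    /** Pairwise reconcile under the assumed rules described in the lead-in. */
    static Cell reconcile(Cell a, Cell b) {
        if (!a.tombstone && !b.tombstone) {
            // Two live values: custom merge (MAX), keeping the newest timestamp.
            return Cell.value(Math.max(a.value, b.value),
                              Math.max(a.timestamp, b.timestamp));
        }
        // At least one tombstone: the newer cell wins; on a tie the tombstone wins.
        if (a.timestamp != b.timestamp) {
            return a.timestamp > b.timestamp ? a : b;
        }
        return a.tombstone ? a : b;
    }

    public static void main(String[] args) {
        Cell old = Cell.value(10, 1);      // written first
        Cell delete = Cell.tombstone(2);   // delete issued after the first write
        Cell recent = Cell.value(5, 3);    // written after the delete

        // Resolved in arrival order: the delete shadows the old value.
        Cell inOrder = reconcile(reconcile(old, delete), recent);
        // Compaction merges the two live values before it sees the tombstone.
        Cell compactedFirst = reconcile(reconcile(old, recent), delete);

        System.out.println(inOrder.value);        // prints 5
        System.out.println(compactedFirst.value); // prints 10
    }
}
```

In the second ordering the deleted value 10 survives because it was folded into a cell whose
timestamp is newer than the tombstone. With plain last-write-wins cells both orderings agree.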


> Custom creation and merge functions for user-defined column types
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-6412
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6412
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Nicolas Favre-Felix
>
> This is a proposal for a new feature, mapping custom types to Cassandra columns.
> These types would provide a creation function and a merge function, to be implemented
> in Java by the user.
> This feature relates to the concept of CRDTs; the proposal is to replicate "operations"
> on these types during write, to apply these operations internally during merge (Column.reconcile),
> and to also merge their values on read.
> The following operations are made possible without reading back any data:
> * MIN or MAX(value) for a column
> * First value for a column
> * Count Distinct
> * HyperLogLog
> * Count-Min
> And any composition of these too, e.g. a Candlestick type includes first, last, min,
> and max.
> The merge operations exposed by these types need to be commutative; this is the case
> for many functions used in analytics.
> This feature is incomplete without some integration with CASSANDRA-4775 (Counters 2.0)
> which provides a Read-Modify-Write implementation for distributed counters. Integrating custom
> creation and merge functions with new counters would let users implement complex CRDTs in
> Cassandra, including:
> * Averages & related (sum of squares, standard deviation)
> * Graphs
> * Sets
> * Custom registers (even with vector clocks)
> I have a working prototype with implementations for min, max, and Candlestick at https://github.com/acunu/cassandra/tree/crdts
> - I'd appreciate any feedback on the design and interfaces.
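
The commutativity requirement in the quoted description can be made concrete with a small
Java sketch of a Candlestick-style type. This is not the interface of the linked prototype:
the Candlestick class, its of() creation function and its merge() method are hypothetical
names chosen for illustration, and the use of per-value timestamps to decide first and last
is an assumption.

```java
// Hypothetical Candlestick type; not the API of the prototype mentioned in the issue.
final class Candlestick {
    final long firstTs;   // timestamp of the earliest value seen
    final double first;
    final long lastTs;    // timestamp of the latest value seen
    final double last;
    final double min;
    final double max;

    Candlestick(long firstTs, double first, long lastTs, double last,
                double min, double max) {
        this.firstTs = firstTs;
        this.first = first;
        this.lastTs = lastTs;
        this.last = last;
        this.min = min;
        this.max = max;
    }

    /** Creation function: a candlestick holding a single observation. */
    static Candlestick of(long timestamp, double value) {
        return new Candlestick(timestamp, value, timestamp, value, value, value);
    }

    /** Merge function: the result does not depend on argument order. */
    Candlestick merge(Candlestick o) {
        // Timestamp ties are broken by value so that both argument orders agree.
        boolean thisFirst = firstTs < o.firstTs
                || (firstTs == o.firstTs && first <= o.first);
        boolean thisLast = lastTs > o.lastTs
                || (lastTs == o.lastTs && last >= o.last);
        return new Candlestick(
                thisFirst ? firstTs : o.firstTs, thisFirst ? first : o.first,
                thisLast ? lastTs : o.lastTs, thisLast ? last : o.last,
                Math.min(min, o.min), Math.max(max, o.max));
    }
}
```

Merging of(1, 10.0), of(2, 7.0) and of(3, 12.0) in any order yields first 10.0, last 12.0,
min 7.0 and max 12.0, which is the property that lets replicas and compaction apply merges
without coordinating on order.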



--
This message was sent by Atlassian JIRA
(v6.1#6144)
