cassandra-commits mailing list archives

From "Aleksey Yeschenko (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-8861) HyperLogLog Collection Type
Date Tue, 03 Mar 2015 03:12:05 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-8861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Yeschenko updated CASSANDRA-8861:
-----------------------------------------
    Fix Version/s: 3.1

> HyperLogLog Collection Type
> ---------------------------
>
>                 Key: CASSANDRA-8861
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8861
>             Project: Cassandra
>          Issue Type: Wish
>            Reporter: Drew Kutcharian
>             Fix For: 3.1
>
>
> Since HyperLogLog and its variants have become popular in the analytics space, and
> Cassandra already has "read-before-write" collections (Lists), I think it would not be too
> painful to add support for a HyperLogLog "collection" type. It would act similarly to CQL 3
> Sets: you would be able to "set" the value and "add" an element, but not remove an element.
> Also, reading the value of a HyperLogLog collection column would return the estimated
> cardinality.
> HyperLogLog has a couple of attributes that fit Cassandra well:
> - Adding an element is idempotent (adding an existing element doesn't change the HLL).
> - An HLL can be thought of as a CRDT, since two HLLs can be merged safely, which means we
> can merge them during read-repair. If that's too much work, I guess we can even live with
> LWW, since these counts are "estimates" after all.
> There is already a proof of concept at:
> http://vilkeliskis.com/blog/2013/12/28/hacking_cassandra.html
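
For readers unfamiliar with the two properties listed in the description, here is a minimal Python sketch of a HyperLogLog. It is not taken from the ticket or from the linked proof of concept, and it is not Cassandra code: the class name, the method names, the SHA-1 based hash, and the default of 2**12 registers are all assumptions made for illustration. It shows that adding an existing element leaves the registers unchanged and that two sketches merge by taking the register-wise maximum, which is the kind of merge a read-repair could apply.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog with 2**p registers (hypothetical, not Cassandra code)."""

    def __init__(self, p=12):
        # The alpha constant used in cardinality() assumes at least 128 registers (p >= 7).
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def _hash64(self, item):
        # Assumption: the first 64 bits of SHA-1 serve as the hash function.
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, item):
        x = self._hash64(item)
        width = 64 - self.p
        idx = x >> width                    # top p bits select a register
        rest = x & ((1 << width) - 1)       # remaining bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = width - rest.bit_length() + 1
        # Registers only ever grow, so re-adding an element changes nothing.
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other):
        # Register-wise max is commutative, associative, and idempotent.
        if self.p != other.p:
            raise ValueError("cannot merge sketches of different precision")
        merged = HyperLogLog(self.p)
        merged.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return merged

    def cardinality(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction: fall back to linear counting.
            return m * math.log(m / zeros)
        return raw


if __name__ == "__main__":
    replica_a, replica_b = HyperLogLog(), HyperLogLog()
    for i in range(15000):
        replica_a.add(f"user-{i}")
    for i in range(10000, 20000):
        replica_b.add(f"user-{i}")
    # The merged estimate approximates the number of distinct users seen by either replica.
    print(round(replica_a.merge(replica_b).cardinality()))
```

Because the merge is a register-wise maximum, the order in which replicas are reconciled does not matter, and reconciling the same replica twice has no effect. Falling back to LWW, as the description suggests, would instead discard whichever sketch has the older timestamp.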



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
