cassandra-commits mailing list archives

From "Aleksey Yeschenko (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-8861) HyperLogLog Collection Type
Date Tue, 03 Mar 2015 03:12:05 GMT


Aleksey Yeschenko updated CASSANDRA-8861:
    Fix Version/s: 3.1

> HyperLogLog Collection Type
> ---------------------------
>                 Key: CASSANDRA-8861
>                 URL:
>             Project: Cassandra
>          Issue Type: Wish
>            Reporter: Drew Kutcharian
>             Fix For: 3.1
> Considering that HyperLogLog and its variants have become popular in the analytics
> space, and that Cassandra already has "read-before-write" collections (Lists), I think it
> would not be too painful to add support for a HyperLogLog "collection" type. It would act
> similarly to a CQL 3 Set: you would be able to "set" the value and "add" an element, but
> not remove an element. Reading the value of a HyperLogLog collection column would
> return the estimated cardinality.
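
To make the proposed semantics concrete, here is a minimal Python sketch of a column that supports "set" and "add", has no remove, and returns an estimated cardinality on read. It is not Cassandra code: the class name `HllColumn`, its method names, the SHA-1 based hash, and the default of 12 index bits are all assumptions made for illustration.

```python
import hashlib
import math


class HllColumn:
    """Hypothetical stand-in for the proposed column type (not Cassandra code)."""

    def __init__(self, p=12):
        self.p = p                     # number of index bits (assumed default)
        self.m = 1 << p                # number of registers
        self.registers = [0] * self.m

    def add(self, element):
        # 64-bit hash of the element; the hash function is an assumption.
        digest = hashlib.sha1(element.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                  # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)     # remaining 64 - p bits
        # Rank: position of the leftmost 1-bit within the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def set_value(self, elements):
        # "set": discard the old sketch and rebuild it from the given elements.
        self.registers = [0] * self.m
        for element in elements:
            self.add(element)

    def cardinality(self):
        # Standard HyperLogLog estimate with the small-range correction.
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            return round(self.m * math.log(self.m / zeros))  # linear counting
        return round(raw)
```

There is deliberately no `remove` method: a register only records the maximum rank it has seen, so the contribution of a single element cannot be undone.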
> HyperLogLog has a couple of properties that fit Cassandra well:
> - Adding an element is idempotent (adding an existing element doesn't change the HLL).
> - An HLL can be thought of as a CRDT, since two HLLs can be merged safely. This means we
>   could merge two HLLs during read repair. If that's too much work, we could even live
>   with LWW (last write wins), since these counts are "estimates" after all.
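
The merge property above can be sketched on raw register arrays. In a standard HyperLogLog, merging is a per-register maximum, which is commutative, associative, and idempotent, so replicas converge regardless of merge order. The function names `merge` and `lww`, the timestamps, and the random 16-register arrays are assumptions for illustration only, not anything from Cassandra.

```python
import random


def merge(a, b):
    """Merge two HLL register arrays by taking the per-register maximum."""
    if len(a) != len(b):
        raise ValueError("register arrays must have the same length")
    return [max(x, y) for x, y in zip(a, b)]


def lww(a, a_ts, b, b_ts):
    """Last-write-wins alternative: keep the replica with the newer timestamp."""
    return a if a_ts >= b_ts else b


# Three made-up replicas of the same HLL column.
rng = random.Random(0)
r1, r2, r3 = ([rng.randint(0, 10) for _ in range(16)] for _ in range(3))

assert merge(r1, r2) == merge(r2, r1)                        # commutative
assert merge(merge(r1, r2), r3) == merge(r1, merge(r2, r3))  # associative
assert merge(r1, r1) == r1                                   # idempotent
```

With `lww`, the older replica's registers are simply discarded, so elements seen only by that replica no longer contribute to the estimate. That is the trade-off the description accepts when it calls the counts "estimates".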
> There is already a proof of concept at:
