cassandra-commits mailing list archives

From "Pablo Chacin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-4914) Aggregate functions in CQL
Date Mon, 23 Dec 2013 09:39:58 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-4914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855523#comment-13855523 ]

Pablo Chacin commented on CASSANDRA-4914:
-----------------------------------------


{quote} the aggregate function would be iteratively called on each grouping. Is that accurate?
Exactly. Basically I'm suggesting that the grouping itself can be done separately by the Selection
object since it's completely generic, it doesn't depend on which aggregate function you'll
execute.
{quote}

[~Sylvain Lebresne] I completely agree with this approach. I used this pattern, which is
basically a visitor pattern, some time ago when implementing a (small-scale) time series
plotting framework, and it worked nicely. It gives you a lot of freedom in how you traverse
the data (e.g. grouping) while reusing a generic set of functions.
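
To illustrate the separation described above, here is a minimal Python sketch. It is not Cassandra code: the names `Sum`, `Count` and `aggregate_by_group` are hypothetical, and it assumes rows arrive already ordered by the grouping key. The traversal only knows how to group rows and feed values to an aggregate; it does not depend on which aggregate is used.

```python
from itertools import groupby
from operator import itemgetter


class Sum:
    """Hypothetical aggregate: receives one value at a time."""

    def __init__(self):
        self.total = 0

    def add(self, value):
        self.total += value

    def result(self):
        return self.total


class Count:
    """Hypothetical aggregate: counts the values it is given."""

    def __init__(self):
        self.n = 0

    def add(self, value):
        self.n += 1

    def result(self):
        return self.n


def aggregate_by_group(rows, key, column, aggregate_factory):
    """Generic traversal: group consecutive rows by `key` and feed
    `column` of each row to a fresh aggregate per group.

    Assumes `rows` is already sorted by `key`."""
    results = {}
    for group_key, group_rows in groupby(rows, key=itemgetter(key)):
        aggregate = aggregate_factory()
        for row in group_rows:
            aggregate.add(row[column])
        results[group_key] = aggregate.result()
    return results


# Sample rows taken from the example in the issue description.
rows = [
    {"empid": 130, "deptid": 3, "salary": 10.1},
    {"empid": 130, "deptid": 2, "salary": 100.0},
    {"empid": 130, "deptid": 1, "salary": 1000.0},
]

totals = aggregate_by_group(rows, "empid", "salary", Sum)
counts = aggregate_by_group(rows, "empid", "salary", Count)
```

Swapping `Sum` for `Count` changes the result without touching the grouping code, which is the point of keeping the two apart.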

In the case of Cassandra, however, one major concern would be where the aggregation runs. For
partitionable functions like sum, count, min or max, each node could do its part of the
aggregation. For non-partitionable functions like average or percentiles, all the aggregation
must be done at the coordinator.
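
The distinction can be sketched in Python as follows. This is an illustration only, not how Cassandra distributes work: `node_partial`, `merge_partials` and `coordinator_median` are hypothetical names, each inner list stands in for the values held by one node, and every node is assumed to hold at least one value. The median is used as the percentile example.

```python
from statistics import median

# Hypothetical data: each inner list is the set of values held by one node.
node_values = [[10.1, 100.0], [1000.0], [5.0, 7.5]]


def node_partial(values):
    """Runs on each node: reduce the local values to a small partial result."""
    return {
        "sum": sum(values),
        "count": len(values),
        "min": min(values),
        "max": max(values),
    }


def merge_partials(partials):
    """Runs on the coordinator: combine partial results without
    ever seeing the raw values."""
    return {
        "sum": sum(p["sum"] for p in partials),
        "count": sum(p["count"] for p in partials),
        "min": min(p["min"] for p in partials),
        "max": max(p["max"] for p in partials),
    }


def coordinator_median(all_node_values):
    """Percentile case: the coordinator gathers every raw value
    before it can compute the result."""
    gathered = [v for values in all_node_values for v in values]
    return median(gathered)


merged = merge_partials([node_partial(values) for values in node_values])
overall_median = coordinator_median(node_values)
```

In the first path only one small record per node crosses the network; in the second, every value does.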


> Aggregate functions in CQL
> --------------------------
>
>                 Key: CASSANDRA-4914
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4914
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Vijay
>            Assignee: Vijay
>             Fix For: 2.1
>
>
> The requirement is to do aggregation of data in Cassandra (wide row of column values of int, double, float, etc.)
> with some basic aggregate functions like AVG, SUM, Mean, Min, Max, etc. (for the columns within a row).
> Example:
> SELECT * FROM emp WHERE empID IN (130) ORDER BY deptID DESC;
>  empid | deptid | first_name | last_name | salary
> -------+--------+------------+-----------+--------
>    130 |      3 |     joe    |     doe   |   10.1
>    130 |      2 |     joe    |     doe   |    100
>    130 |      1 |     joe    |     doe   |  1e+03
>  
> SELECT sum(salary), empid FROM emp WHERE empID IN (130);
>  sum(salary) | empid
> -------------+-------
>       1110.1 |   130



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
