cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-10707) Add support for Group By to Select statement
Date Thu, 30 Jun 2016 10:28:11 GMT


Sylvain Lebresne commented on CASSANDRA-10707:

Sorry for the long iterations on the reviews. I still have a bunch of remarks, though a lot
are fairly minor. In general, I "think" the general logic is fine, but it's still a bit hard
to wrap your head around, so I'm also listing things that are unclear to me, for which some
extra commenting might be all that's missing. Anyway, here we go:
* On {{GroupMaker.State}}:
  ** I'd promote it to a top-level {{GroupingState}} class since it's used in {{DataLimits}}
too and would have a nice symmetry with {{PagingState}}.
  ** Any reason not to reuse the {{Clustering.Serializer}} in the serializer (it probably
requires keeping the CFMetaData or ClusteringComparator around for the types, but that's not
a big deal)? In particular, the hand-rolled serialization doesn't handle nulls (and should).
  ** I think it would be useful to spell out in more detail what having or not having each component
means. My understanding is that if {{partitionKey() != null}} and {{clustering == null}},
it means we haven't started counting the partition at all (meaning that seeing that partition
in the previous page made us close a group and subsequently hit the page limit, hence stopping).
But if {{clustering != null}}, it means we stopped the previous page in the middle of a group.
But I'm not sure I'm fully correct (maybe I'm missing some edge cases in particular?) and
documenting this would be nice.
  ** Nit: The comment on the {{clustering()}} method is incorrect (it talks about the "partition key").
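On the null-handling point for the state serializer: a minimal standalone sketch of null-safe component serialization (not the patch's actual serializer; it just illustrates writing a presence flag before each nullable component so a null round-trips cleanly):

```java
import java.io.*;

public class NullSafeSerialization {
    // Write a nullable byte[] component with a presence flag so null round-trips.
    static void writeComponent(DataOutputStream out, byte[] component) throws IOException {
        out.writeBoolean(component != null);
        if (component != null) {
            out.writeInt(component.length);
            out.write(component);
        }
    }

    static byte[] readComponent(DataInputStream in) throws IOException {
        if (!in.readBoolean())
            return null;
        byte[] component = new byte[in.readInt()];
        in.readFully(component);
        return component;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        writeComponent(out, new byte[]{1, 2});
        writeComponent(out, null); // a null clustering component survives the round trip

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(buf.toByteArray()));
        System.out.println(java.util.Arrays.toString(readComponent(in))); // [1, 2]
        System.out.println(readComponent(in));                            // null
    }
}
```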
* In general, the handling of the static column is a bit subtle, and it would be useful to
have a comment somewhere that explains its general handling in detail. For instance, the first
case of {{GroupByAwareCounter.applyToPartition()}} slightly contradicts my understanding of
what {{state.partitionKey()}} means. That is, I thought that if reaching a new partition {{X}}
makes us close a group and hit a page limit, then we'll stop and have {{X}} as the partitionKey
in the state. However, on the next page, it seems we assume we've somehow already accounted
for the static row, and I'm unclear why. The code in {{applyToStatic}} is also confusing to me:
to start with, what is the subtle difference between the first condition ({{!row.isEmpty()
&& (assumeLiveData || row.hasLiveData(nowInSec))}}) and the second one ({{hasLiveStaticRow}})?
* In {{DataResolver}}, why not just call {{rowCounted()}} unconditionally (rather than
adding helper methods)?
* In {{PkPrefixGroupMaker}}, instead of dealing with a ByteBuffer array, I'd just keep the
{{Clustering}} of the last row seen (null initially), and the prefix size. We could then even
delegate the check of whether 2 clusterings belong to the same group to some {{Clustering}}
method. On that front, we shouldn't use {{ByteBuffer.equals}} but rather the {{ClusteringComparator}}
types. It happens that, for some types, value equality does not align with byte equality (ok, one
type, IntegerType, which allows as many zeroed bytes as a prefix as you want without changing
the actual value).
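To illustrate the {{IntegerType}} caveat: two encodings can differ byte-wise yet decode to the same integer value, which is why raw byte equality ({{ByteBuffer.equals}}) is the wrong check. A standalone sketch using {{java.math.BigInteger}}, which, like {{IntegerType}}, interprets bytes as a two's-complement big-endian integer:

```java
import java.math.BigInteger;
import java.util.Arrays;

public class IntegerTypeEquality {
    public static void main(String[] args) {
        // Two byte encodings of the value 1: with and without a leading zero byte.
        byte[] a = {0x01};
        byte[] b = {0x00, 0x01};

        // Byte-wise comparison (what ByteBuffer.equals would do) says "different"...
        System.out.println("bytes equal:  " + Arrays.equals(a, b));

        // ...but the decoded integer values are the same, which is what a
        // type-aware comparison (the ClusteringComparator types) would report.
        System.out.println("values equal: " + new BigInteger(a).equals(new BigInteger(b)));
    }
}
```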
* In {{DataLimits.CQLGroupByLimits}}, the silent assumption that {{filter()}} should only be called
on the replica, and {{newCounter()}} on the coordinator, feels pretty error-prone for the caller
(I haven't even carefully checked that it's the case, tbh). One option would be to rename the methods
to {{filterOnReplica()}} and {{newCounterOnCoordinator()}}, but even that feels a bit weird/limiting.
I would prefer removing the {{onReplica}} flag completely and handling that in deserialization:
we can assume when deserializing a {{DataLimits}} that we are on the {{replica}} (it's arguably
still a somewhat silent assumption, but one that feels safer to me; we have no reason to serialize
a {{ReadCommand}} except to send it to a replica, while needing to call {{filter()}} on
the coordinator or {{newCounter()}} on replicas in the future sounds very plausible), so we
could just call some {{withoutClustering()}} method on the {{State}} if it's a range query.
Another, more explicit alternative would be to add some {{onReplica()}} method to {{ReadCommand}}
that we'd call in {{ReadCommandVerbHandler}} (where we know for sure we are on a replica)
and that would return a modified {{ReadCommand}} with modified {{DataLimits}}. That
said, it's probably a bit more code just for that, so I think I'm personally fine with the
deserializer option.
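A rough standalone sketch of the deserializer option (all names here — {{GroupingState}}, {{withoutClustering()}}, the {{isRangeQuery}} flag — are placeholders for illustration, not the patch's actual API):

```java
import java.util.Optional;

public class DeserializeOnReplica {
    // Placeholder for the grouping/paging state carried in DataLimits.
    record GroupingState(byte[] partitionKey, Optional<byte[]> clustering) {
        // Drop the clustering component, keeping only the partition key.
        GroupingState withoutClustering() {
            return new GroupingState(partitionKey, Optional.empty());
        }
    }

    // On deserialization we know we're on a replica, so for a range query
    // we can strip the clustering from the state right away, removing the
    // need for a separate onReplica flag.
    static GroupingState deserialize(GroupingState wireState, boolean isRangeQuery) {
        return isRangeQuery ? wireState.withoutClustering() : wireState;
    }

    public static void main(String[] args) {
        GroupingState s = new GroupingState(new byte[]{1}, Optional.of(new byte[]{2}));
        System.out.println(deserialize(s, true).clustering().isPresent());  // false
        System.out.println(deserialize(s, false).clustering().isPresent()); // true
    }
}
```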
* In {{GroupByAwareCounter}}:
  ** in {{applyToPartition}}, in the comment starting with {{partitionKey is the last key for
which we're returned rows}}, I believe it should read {{state.partitionKey() is the last ...}}.
  ** at the end of {{applyToPartition}}, why the {{!isDone()}} test? It seems that since you've just
changed the {{currentPartitionKey}}, you should unconditionally change the other variables.
  ** In {{applyToRow}}, why do we set {{hasGroupStarted}} to {{false}} as soon as we find a non-live
row? It makes it sound like we "close" a group when we get a non-live row in the middle of
a group.
  ** I'm confused by the 2nd part of the comment on {{onClose()}} (starting with {{The last
group ...}}). I'm not sure I see how it relates to the next line (contrary to the first
part of the comment), nor am I entirely sure what that part of the comment is trying to say
in general.
  ** Nit: we should move {{nowInSec}} and {{assumeLiveData}} to {{Counter}} and add a {{protected
boolean isLive(Row row)}} helper method.
  ** Nit: we don't need to pass the state to the ctor (currently it's slightly confusing that the
ctor doesn't store it but it's used by other methods). Same for the spec.
* For the documentation change, it would be nice to modify the new documentation now that it's
committed. Also, the modifications to the textile file have some merge markers left in them (I'm
happy if you just want to skip the textile file; I assume we'll probably just remove it soon).
* At the end of {{SelectStatement.getAggregationSpecification}}, I believe the {{clusteringPrefixSize
> 0 && isDistinct}} test is redundant since we'd have exited early in that case.
Happy to make it an assertion, though I'm not sure it's that useful (the code is relatively
straightforward).
* Not sure why in {{ReadCommand}} the patch switches the {{withStateTracking}} and {{withPurgeableTombstone}}
transformations. Intuitively, it feels to me that having {{withStateTracking}} (which checks for
aborts) first is more logical.
* Nit: In {{AggregationSpecification.Serializer}}, I'd add a {{break}} after the {{case}}
in the {{serialize}} and {{serializedSize}} methods, just to avoid future mistakes (I would also
maybe add a case for {{AGGREGATE_EVERYTHING}} that does nothing, for symmetry with {{deserialize}}).
Also, in {{deserialize}}, it seems people prefer
normalizing on putting the {{AssertionError}} in a {{default}} case.
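A standalone sketch of the suggested {{serialize}} shape, with a {{break}} per case and the {{AssertionError}} in a {{default}} case (the {{Kind}} enum and the fields written are placeholders for illustration, not the actual serializer):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class SpecSerializer {
    // Placeholder for the AggregationSpecification kinds.
    enum Kind { AGGREGATE_EVERYTHING, AGGREGATE_BY_PK_PREFIX }

    static void serialize(Kind kind, int pkPrefixSize, DataOutputStream out) throws IOException {
        out.writeByte(kind.ordinal());
        switch (kind) {
            case AGGREGATE_EVERYTHING:
                // Nothing else to write, but keeping the case (and break) mirrors
                // deserialize and guards against accidental fall-through.
                break;
            case AGGREGATE_BY_PK_PREFIX:
                out.writeInt(pkPrefixSize);
                break;
            default:
                // Unknown kinds fail loudly instead of silently falling through.
                throw new AssertionError("Unknown kind: " + kind);
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        serialize(Kind.AGGREGATE_BY_PK_PREFIX, 2, new DataOutputStream(buf));
        System.out.println(buf.size()); // 1 byte for the kind + 4 for the int = 5
    }
}
```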
* In {{AggregationQueryPager}}:
  ** {{handlePagingOff()}} is called twice (both before calling the iterator ctor on the argument
and inside the ctor).
  ** the pager is doing paging within a top-level page, and the naming/javadoc doesn't help avoid
confusion. As an example, in {{computePageSize(int pageSize, int counted)}}, the "pageSize"
in the method name refers to sub-pages, while the "pageSize" argument is the top-level page
size. Some small, consistent naming changes would help the reader (as would a top-level
comment calling attention to the "subtlety").
* The NEWS file change should be "rebased".

> Add support for Group By to Select statement
> --------------------------------------------
>                 Key: CASSANDRA-10707
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: CQL
>            Reporter: Benjamin Lerer
>            Assignee: Benjamin Lerer
>             Fix For: 3.x
> Now that Cassandra supports aggregate functions, it makes sense to support {{GROUP BY}} on {{SELECT}} statements.
> It should be possible to group either at the partition level or at the clustering column level.
> {code}
> SELECT partitionKey, max(value) FROM myTable GROUP BY partitionKey;
> SELECT partitionKey, clustering0, clustering1, max(value) FROM myTable GROUP BY partitionKey, clustering0, clustering1;
> {code}

This message was sent by Atlassian JIRA
