cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10707) Add support for Group By to Select statement
Date Wed, 24 Feb 2016 19:52:18 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15163608#comment-15163608
] 

Sylvain Lebresne commented on CASSANDRA-10707:
----------------------------------------------

The first point that needs discussing is that this patch changes the inter-node protocol and
this without bumping the protocol version, which is something we've never done.

On the one side bumping the protocol version is something we've kept for major releases and
so far planned not to do before 4.0. But on the other side, not being able to add anything
to the inter-node messages limits a lot new features (like this one) and doesn't make sense
in a tick-tock world.

We could accept this kind of change without bumping the version on the argument that it's
only "additive" to the serialization format and that it breaks nothing if the new feature
is not used, and we'd document that "you shall not use GROUP BY until all nodes in the cluster
are updated to a version that supports it". But it's not something to treat lightly because
if a user don't respect that rule, older node will fail deserialization of the message, which
will drop the connection, which can impact other on-ongoing queries, which is not good.

Alternatively, we could simply bump the protocol version, forgetting the "not before 4.0"
rule. The advantage is that if we do so we can have the sending node (instead of the receiving
one) throw if a GROUP BY is used but not all nodes are up to date, which is cleaner and avoid
the connection dropped problem.  

So something we need to decide first, and maybe we need a separate ticket to have that discussion.

Anyway, outside of this question, some general remarks on the patch:
* In the parser, we have the {{GROUP BY}} before the {{ORDER BY}}. On the one side it's more
consistent with {{SQL}} but on the other side, we do happen to order before we group. And
we kind of know that post-query ordering is not really something we can do due to paging (we
do happen to do it in a few case when paging is not involved out of backward compatibility
but given the limitation, I don't think we'll want to ever extend that) so it does feel the
{{ORDER BY}} should come before {{GROUP BY}}.
* The {{forPagingByQueryPager}} and {{forGroupByInternalPaging}} methods are not very intuitive:
it's unclear when just looking at {{DataLimits}} why they exist and why you should use them
or not. I'm also not entirely sure we need the difference between the two. If I understand
correctly, the problem is that when a replica ends a page in the middle of a group, the coordinator
don't know if said replica has stopped because it reaches the end of a group which was the
last one to return (in which case we're done), or if the replica stopped because of the page
limit (but the group itself is not finished). But the coordinator should be able to figure
that out by checking it's row counts: if after having consumed the page from the replica we've
reached the group limit _or_ there was less row in the page than the page limit, then we're
done, otherwise we should ask for more.  Granted, that does mean in the rare case where we've
exhausted all data exactly on the page limit, we'll do an additional query (that will be empty),
but pretty sure you have that problem in one form or another whatever you do.  Am I missing
something here?
* Not sure how I feel about {{GroupSelector}}. It's a bit confusing as of this patch and while
I get that its for allowing functions later, it might not be as generic as we may need. For
instance, if we ever want to support function on multiple columns (which would be possible,
assuming those are consecutive PK columns), this won't work, at least not as implemented.
I think I'd slightly prefer removing {{GroupSelector}} for now and just rely on extending
{{GroupBySpecification}} for future extension (typically we'll add a {{withFunctionGroupBySpec}}
which would have all liberty as to how it is implemented).  Doing so also mean we can simplify
the probably common and only currently supported case as we just need to keep the length of
the prefix we group by rather than a list of (dumb) selector.
* I feel keeping/passing the {{lastPartitionKey}} and {{lastClustering}} in {{DataLimits.CQLGroupByLimits}}
and {{GroupMaker}} is not as future proof/generic as the patch is otherwise trying to be.
Typically, if we allow functions, the {{lastClustering}} won't really be a clustering. Granted,
it'll still fit into a {{ByteBuffer[]}} but it'll be a bit of an abuse. So I would suggest
adding {{GroupMaker.State}} concept whose implementation would be specific to the {{GroupMaker/GroupBySpecification}}
but would be transparent to other code. Imo that would make some code cleaner (for instance,
{{GroupMaker.GROUP_EVERYTHING}} wouldn't need to store anything, which makes sense).
* {{GroupBySpeccification.NO_GROUPING}} feels confusing because it really means the query
is not an aggregate, while the naming makes it sound like it has no {{GROUP BY}}.  Similarly,
the {{isGrouping}} method really mean {{hasAggregate}}. In fact, I think it would be more
clear to just have no {{GroupBySpecification}} in that case, keeping it {{null}} in SelectStatement
(and in {{Selection}}). Having {{groupBySpec != null}} to mean no aggregate is imo more clear
than {{!groupBySpec.isGrouping()}}. I'm also confused as to why {{GroupMaker.NO_GROUPING}}
even exists, we shouldn't ever create a {{GroupMaker}} in the first place if we're not aggregating
at all.
* In {{DataLimits}}, the naming of {{noGroupByLimits}} (and {{distinctNoGroupByLimits}}) is
confusing (I mean, the both create a {{CQLGroupByLimits}}!). I think we could remove then
and just pass {{cqlRowLimit}} directly as first argument in {{SelectStatement.getDataLimits}}.
* Related to the previous point, some phrasing is a tad confusing, like a comment saying {{GroupByPartitionIterator
for queries without Group By}} in {{GroupByQueryPager}}. I think that's because {{GroupSpecification}},
{{GroupByLimits}} and {{GroupByQueryPager}} classes are really about handling aggregates in
general, not query having an explicit {{GROUP BY}}.  So maybe renaming to {{AggregationSpecification}},
{{AggregationLimits}} and {{AggregationQueryPager}} would be more clear?
* The logic in {{SelectStatement.RawStatement.getGroupBySpecification}} breaks (doesn't throw
a "proper" exception) if someone write {{SELECT * FROM t GROUP BY a, a}} where {{a}} is the
sole PK column (the query is stupid, but it should throw a proper exception). More generally,
I feel that the logic don't make it too intuitive that we're indeed checking  proper ordering
of the column. As a nit, instead of copying both partition and clustering columns in {{CFMetaData.primaryKeyColumns}},
I'd use {{Iterables.concat}} instead (and use an iterator in getGroupBySpecification, which
would make it even more apparent than not checking {{hasNext}} is a bad idea).
* Having {{CQLGroupByLimits}} inherit {{estimateTotalResults}} from {{CQLLimits}} is theoretically
wrong as we should return an estimate of groups. Now, properly doing that is hard and so for
now I'm fine reusing the one of {{CQLLimits}} for simplicity but we should call this out clearly
by overriding the method (having it just call {{super.estimateTotalResults()}}) with a comment
explaining we know it's off and should ideally be fixed someday. In fact, I wonder if having
{{CQLGroupByLimits}} inherit {{CQLLimits}} is a good idea: they work sufficiently differently
that it feels risky to have method inherited silently. The only other method not overriden
is {{hasEnoughLiveData}} and we could easily use a {{private static}} helper for code reuse
in that case. Anyway, I don't mind the inheritence terribly if you prefer but figured I'd
mention it.
* Not sure about {{CQLGroupByLimits.forShortReadRetry()}}. I believe putting no limit on the
number of rows (and only on the group) might lead to OOM.  In fact, I need to think more carefully
about this but I'm not 100% sure that the short read logic isn't throw off by the fact that
{{counted()}} returns a number of groups not rows.
* In {{GroupByQueryPager}}, both iterator implementation have a {{correctPageSize()}} method
that is unused. And in the case of {{AggregationPartitionIterator}}, it seems fishy that it's
not used.
* Shouldn't {{GroupByQueryPager.GoupByPartitionIterator.close()}} set {{closed}} if it isn't
set?
* I think we lack some validation/handling of composite partition keys. If I have {{PRIMARY
KEY ((a, b), c)}} and I do {{SELECT count( *) FROM t GROUP BY a}}, I believe this silently
do the grouping by both {{a}} and {{b}}. We should reject this since we have no way to handle
it.


Also, a bunch of minor nits:
* {{GroupByQueryPager.GroupByPartitionIterator.GroupByRowIterator}} could call {{rowIterator.partitionKey()}}
rather than stores the partition key separately. Regarding that class, we also should remove
all the {{@Override}} according to the [code style|https://wiki.apache.org/cassandra/CodeStyle].
* Why do we have {{rowPageSize}} in {{CQLGroupByLimits}}. The ctor seems to always make it
equal to {{rowLimit}} and both fields are {{final}} (so will stay equal).
* In {{Selection.resultSetBuilder}}, the {{isJons}} parameter is a typo for {{isJson}}. I
know this is not new to this patch but lets fix it rather than duplicate the typo.
* Your IDE is not respecting the import order of the code-style. It moved {{com.google}} imports
after the {{org.apache}} ones in {{Selection.java}} for instance (see the [code style|https://wiki.apache.org/cassandra/CodeStyle]).
I also don't love that it expands {{*}} as this adds noise to patches but that might be more
personal.
* We can now remove the {{// Note that if there are some nodes in the cluster with a version
less than 2.0, we can't use paging (CASSANDRA-6707)}} line (in {{SelectStatement.getDataLimits}}).
* Could add a comment in {{SelectStatement.RawStatement.getGroupBySpecification()}} as to
why we don't do anything with partition key columns.
* Was there a reason to change {{s/cqlRowLimit/userLimit}} in the last 2 lines of {{SelectStatement.getDataLimits()}}?
Pretty sure it doesn't matter but since I don't find the change particularly better I'm left
wondering if I'm missing the reason for the change.
* Could be wrong, but it doesn't seem we need to add {{withUpdatedLimit()}} to {{ReadQuery}}
(that is, we only use it on a {{ReadCommand}}). If true, I'd rather remove it from {{ReadQuery}}
which allow to remove the unused implementation in {{SinglePartitionReadCommand.Group}}.
* Indentation for the parameters of {{GroupMaker.newInstance}} is off.
* It would make more sense for {{QueryPager.EMPTY.withUpdatedLimit()}} to throw {{UnsupportedOperationException}}.
* In {{DataLimits.CQLLimits.hasEnoughLiveData}}, a comment has been butchered.
* Some indentation is off in {{DataLimits.Serializer.deserialize}}.
* In {{GroupByQueryPager.handlePagingOff}}, the comment is incomplete.
* Unecessary import of {{Optional}} in {{Restrictions.java}}.


> Add support for Group By to Select statement
> --------------------------------------------
>
>                 Key: CASSANDRA-10707
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10707
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: CQL
>            Reporter: Benjamin Lerer
>            Assignee: Benjamin Lerer
>
> Now that Cassandra support aggregate functions, it makes sense to support {{GROUP BY}}
on the {{SELECT}} statements.
> It should be possible to group either at the partition level or at the clustering column
level.
> {code}
> SELECT partitionKey, max(value) FROM myTable GROUP BY partitionKey;
> SELECT partitionKey, clustering0, clustering1, max(value) FROM myTable GROUP BY partitionKey,
clustering0, clustering1; 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message