I would like to make quite sure about this implicit GROUP BY "feature",
I would not think of it like that; this is about the physical order of rows.

since it seems really important yet does not seem to be mentioned in the
CQL reference documentation.
It's baked in; this is how the data is organised within the row.

http://www.datastax.com/dev/blog/thrift-to-cql3
We often say the PRIMARY KEY is made up of the PARTITION KEY and the GROUPING COLUMNS.
http://www.datastax.com/documentation/cql/3.0/webhelp/index.html#cql/cql_reference/create_table_r.html

See also http://thelastpickle.com/blog/2013/01/11/primary-keys-in-cql.html

Is it something we can bet the farm and farmer's family on?
Sure. 

The kinds of scenarios where I am wondering if it's possible for partition-key groups
to get intermingled are :
All instances of the table entity with the same value(s) for the PARTITION KEY portion of the PRIMARY KEY exist in the same storage engine row.
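
As a rough illustration of that statement, here is a toy Python model (not Cassandra code; every name in it is hypothetical). It assumes a table whose PRIMARY KEY is (sensor, ts), with sensor as the PARTITION KEY and ts as the single GROUPING COLUMN. Each storage engine row is modelled as one list, so a scan that walks storage rows one at a time cannot interleave CQL rows from different partitions.

```python
from collections import defaultdict

# Toy model only: this is NOT how Cassandra is implemented.
# "sensor" and "ts" are hypothetical column names chosen for the example.

def build_partitions(cql_rows, partition_key, grouping_column):
    """Place every CQL row in the storage row for its partition key,
    keeping each storage row sorted by the grouping column."""
    partitions = defaultdict(list)
    for row in cql_rows:
        partitions[row[partition_key]].append(row)
    for rows in partitions.values():
        rows.sort(key=lambda r: r[grouping_column])
    return dict(partitions)

def scan(partitions):
    """Walk the storage rows one at a time, yielding their CQL rows.
    Real Cassandra visits partitions in token order; this sketch just
    uses dict insertion order, which is enough to show the grouping."""
    for key in partitions:
        yield from partitions[key]

# CQL rows arrive in an arbitrary, interleaved order.
rows = [
    {"sensor": "a", "ts": 2, "value": 10},
    {"sensor": "b", "ts": 1, "value": 20},
    {"sensor": "a", "ts": 1, "value": 30},
    {"sensor": "b", "ts": 2, "value": 40},
]

partitions = build_partitions(rows, "sensor", "ts")
scanned = [(r["sensor"], r["ts"]) for r in scan(partitions)]
print(scanned)
```

In this sketch the grouping is not a query-time step: it falls out of where each CQL row is stored, which is the sense in which it is "baked in".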

  .   what if the node containing primary copy of a row is down
There is no primary copy of a row. 

  .   what if there is a heavy stream of UPDATE activity from applications which
      connect to all nodes,   causing different nodes to have different versions of replicas of same row?
That's fine with me. 
It's only an issue when the data is read, and at that point the Consistency Level determines what we do. 
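
A minimal sketch of that read-time behaviour, again as a toy Python model with hypothetical names rather than Cassandra's actual read path. It assumes each column value carries a write timestamp and that the newest timestamp wins when versions from several replicas are merged. The replicas_to_ask parameter stands in for the Consistency Level (1 for ONE, 2 for QUORUM with three replicas).

```python
# Toy model only: NOT Cassandra code. All names are hypothetical.
# Each replica maps partition key -> {column: (value, write_timestamp)}.

def read(replicas, partition_key, replicas_to_ask):
    """Ask the first `replicas_to_ask` replicas for one partition and
    keep, per column, the version with the newest write timestamp."""
    merged = {}
    for replica in replicas[:replicas_to_ask]:
        for column, (value, timestamp) in replica.get(partition_key, {}).items():
            if column not in merged or timestamp > merged[column][1]:
                merged[column] = (value, timestamp)
    return {column: value for column, (value, _) in merged.items()}

# Three replicas of partition "a"; node1 has not yet seen the latest UPDATE.
node1 = {"a": {"value": (10, 100)}}
node2 = {"a": {"value": (11, 200)}}
node3 = {"a": {"value": (11, 200)}}
replicas = [node1, node2, node3]

read_one = read(replicas, "a", 1)      # may return the stale version
read_quorum = read(replicas, "a", 2)   # newest version among those asked
print(read_one, read_quorum)
```

Either way, the read returns versions of the one partition that was asked for; differing replica versions affect which values come back, not which partition the rows belong to.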

Hope that helps. 


-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 12/09/2013, at 7:43 AM, John Lumby <johnlumby@hotmail.com> wrote:

I would like to make quite sure about this implicit GROUP BY "feature",

since it seems really important yet does not seem to be mentioned in the
CQL reference documentation.



Aaron,   you said "yes"  --   is that "yes,  always,   in all scenarios no matter what"

or "yes usually"?      Is it something we can bet the farm and farmer's family on?



The kinds of scenarios where I am wondering if it's possible for partition-key groups
to get intermingled are :



  .   what if the node containing primary copy of a row is down
      and cassandra fetches this row from a replica on a different node
      (e.g.  with CONSISTENCY ONE)

  .   what if there is a heavy stream of UPDATE activity from applications which
      connect to all nodes,   causing different nodes to have different versions of replicas of same row?



Can you point me to some place in the cassandra source code where this grouping is ensured?



Many thanks,

John Lumby