cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-2474) CQL support for compound columns
Date Mon, 05 Sep 2011 14:32:10 GMT


Sylvain Lebresne commented on CASSANDRA-2474:

I agree with Eric's earlier comment: this issue could stand to be summarized. I'm not
sure I understand what has been proposed so far, so I apologize in advance if the
proposals above already answer everything below.

However, it seems that we're focusing here on a representation based on materialized views.
Did we focus on that because we consider the basic use cases for composite types, those where
we don't use them for materialized views at all, easy to deal with?

Why not consider a composite column name for what it is: *one* column name that is composed
of multiple sub-elements? What I mean is that I'm not convinced that
bq. the original idea from CASSANDRA-2025 of "SELECT columnA:x, columnA:y FROM foo WHERE key = 'bar'" is the wrong way to go

I'm even less convinced when I see the number of comments on this ticket.

Again, it seems that the focus was exclusively on materialized views, but I strongly think
that composite column names are useful for more than materialized views (I've used composite
column names countless times, never for a materialized view).

But let's take an example of what I mean. Suppose that what you store in your column family
are events. Those events arrive with a timestamp whose resolution is maybe the minute (or
more precisely, you only care about querying them at that precision). Those events have a category
(which may have a meaningful sort order), and maybe a sub-category. They also have a unique
identifier eventId. Moreover, there are a lot of events every minute and the categories/sub-categories
are not necessarily predefined. The queries you want to run are typically:
  * Give me all the events for time t, category c and sub-category sc.
  * Give me all the events for time t and category c.
  * Give me all the events for time t and category c1 to c2 (where c1 < c2 in the category sort order).
  * Give me everything for the last 4 hours.
Most of those would probably require paging because there are shit tons of events, but still,
I want them to be fast.

I haven't found a better data model for that kind of example than using a composite column
name where the name is (timestamp, category, sub-category, eventId).
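
To make the model concrete, here is a minimal Python sketch of how the four queries above map onto contiguous slices of one sorted row. It is an in-memory illustration only, not Cassandra or CQL code: the names `events` and `slice_range` and the sample data are hypothetical, and it assumes timestamps and event ids are integers and categories are strings that compare in the desired order.

```python
from bisect import bisect_left

# Hypothetical stand-in for one row: composite column names are tuples of
# (timestamp, category, sub_category, event_id), kept in sorted order the way
# a composite comparator would order them component by component.
events = sorted([
    (100, "error", "disk", 1),
    (100, "error", "net", 2),
    (100, "info", "login", 3),
    (100, "warn", "disk", 4),
    (101, "error", "disk", 5),
])

def slice_range(row, start, end):
    """Return the names between two component prefixes, both inclusive.

    `start` and `end` are tuples holding only the leading components that
    matter, e.g. (t,) or (t, category). Passing the same prefix for both
    selects every name that begins with that prefix.
    """
    i = bisect_left(row, start)          # first name >= start prefix
    result = []
    for name in row[i:]:
        if name[:len(end)] > end:        # walked past the end prefix
            break
        result.append(name)
    return result

# time t, category c, sub-category sc
by_sub_category = slice_range(events, (100, "error", "disk"), (100, "error", "disk"))
# time t, category c
by_category = slice_range(events, (100, "error"), (100, "error"))
# time t, categories c1 to c2
by_category_range = slice_range(events, (100, "error"), (100, "info"))
# a time window (the "last 4 hours" case, with two timestamps as bounds)
by_time_window = slice_range(events, (100,), (101,))
```

Each query touches a single contiguous run of the sorted names, which is why the component order (timestamp first, then category, then sub-category) matters for this set of queries.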

I haven't found in all the discussion above anything that would allow me to do this better
than what is in the initial proposition of CASSANDRA-2025.

Now I completely agree that having a good notation to work with materialized views would be
great, but IMO if we try to find a syntax that is too far from how composite columns work,
I fear we'll end up limiting the usefulness of composite types in CQL to one narrow use case.

I'll also note that I haven't seen any proposal for what insertion with compound types should
look like.

> CQL support for compound columns
> --------------------------------
>                 Key: CASSANDRA-2474
>                 URL:
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: API, Core
>            Reporter: Eric Evans
>            Assignee: Pavel Yaskevich
>              Labels: cql
>             Fix For: 1.0
>         Attachments: screenshot-1.jpg, screenshot-2.jpg
> For the most part, this boils down to supporting the specification of compound column
names (the CQL syntax is colon-delimited terms), and then teaching the decoders (drivers) to
create structures from the results.

