cassandra-commits mailing list archives

From "Sylvain Lebresne (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-2474) CQL support for compound columns
Date Tue, 10 Jan 2012 17:14:43 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183386#comment-13183386 ]

Sylvain Lebresne commented on CASSANDRA-2474:
---------------------------------------------

I guess my argument is that while a PK does uniquely identify a record, which fits the SQL notion
on that point, it will differ in a number of ways from how a PK works in SQL. Namely, the
order of the arguments in the PRIMARY KEY definition matters, and quite
a lot actually.

To be concrete, if you define:
{noformat}
CREATE TABLE table (
    X text,
    Y text,
    Z text,
    V int,
    PRIMARY KEY (X, Y, Z)
) WITH COMPACT STORAGE;
{noformat}
then this is not at all the same as having {{PRIMARY KEY (X, Z, Y)}} or even {{PRIMARY KEY
(Z, Y, X)}}, and in particular:
* you will be able to do {{SELECT * WHERE X = 'foo' AND Y = 'bar'}} but not {{SELECT * WHERE
X = 'foo' AND Z = 'bar'}}
* you will be able to do {{INSERT INTO table (X, Y, V) values ('foo', 'bar', 3)}}, but not
{{INSERT INTO table (X, Z, V) values ('foo', 'bar', 3)}}.
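
One way to read the two examples above is that the key columns named in a query must form a leading prefix of the PRIMARY KEY definition. The Python sketch below models only that reading. The function name {{is_leading_prefix}} is hypothetical, and this is a simplification inferred from the examples, not Cassandra's actual validation logic.

```python
def is_leading_prefix(primary_key, restricted):
    """Return True if `restricted` names exactly the first
    len(restricted) columns of `primary_key` (in any order)."""
    restricted = set(restricted)
    return restricted == set(primary_key[:len(restricted)])


pk = ["X", "Y", "Z"]                                   # PRIMARY KEY (X, Y, Z)
print(is_leading_prefix(pk, ["X", "Y"]))               # True:  X, Y is a prefix
print(is_leading_prefix(pk, ["X", "Z"]))               # False: Y is skipped
print(is_leading_prefix(["X", "Z", "Y"], ["X", "Z"]))  # True under PRIMARY KEY (X, Z, Y)
```

Under this model, swapping Y and Z in the key definition flips which of the two queries is accepted, which is why the declaration order is not cosmetic.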

There is also the sorting, which is not imo an implementation detail because it is absolutely
fundamental for the wide row case.
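
To illustrate why the sorting matters, here is a toy in-memory model in Python, assuming X is the row key and (Y, Z) is the compound column name within the row. The names {{rows}}, {{insert}} and {{slice_by_y}} are made up for this sketch, and it says nothing about how the storage engine is actually implemented. Rows are looked up by key only, while the cells inside a row are kept sorted by (Y, Z).

```python
import bisect

# Toy model: row key X -> list of ((Y, Z), V) cells kept sorted by (Y, Z).
# The dict itself is only used for lookup by key; row keys are not sorted.
rows = {}


def insert(x, y, z, v):
    """Insert or overwrite the cell (y, z) in row x, keeping cells sorted."""
    cells = rows.setdefault(x, [])
    keys = [k for k, _ in cells]
    i = bisect.bisect_left(keys, (y, z))
    if i < len(cells) and cells[i][0] == (y, z):
        cells[i] = ((y, z), v)          # same compound name: overwrite
    else:
        cells.insert(i, ((y, z), v))    # new cell at its sorted position


def slice_by_y(x, y):
    """Return (Z, V) pairs of row x whose first component is y, in Z order."""
    cells = rows.get(x, [])
    keys = [k for k, _ in cells]
    # (y,) sorts before every (y, z), so this finds the first cell with Y >= y.
    start = bisect.bisect_left(keys, (y,))
    out = []
    for (cy, cz), v in cells[start:]:
        if cy != y:
            break                       # contiguous run for y has ended
        out.append((cz, v))
    return out


insert("foo", "b", "2", 1)
insert("foo", "a", "9", 2)
insert("foo", "b", "1", 3)
print(slice_by_y("foo", "b"))  # [('1', 3), ('2', 1)]
```

Because the cells for a given Y are contiguous, a restriction on Y is a single slice of the row. A restriction on Z alone has no such contiguous range, which matches the restriction on queries described above.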

I guess there are two points:
# If a concept we introduce differs in reasonably subtle ways from a SQL concept, we'd probably
better avoid reusing the same name. I feel it'll be less confusing to have people ask upfront
what a given (unknown) concept is, and to explain it by saying that it is close to some other
SQL concept but with given differences, rather than having everyone think they know how it
works based on the name and the superficial similarities and get bitten by the differences later
on. That's an argument for renaming PK to something else.
# The sorting is a rather important concept in the case of wide rows. And we don't sort on
the row key. So it feels that splitting the PK into two concepts would make the syntax more
informative. On the one hand we would lose a bit of the intuition of what uniquely identifies
a record, but we'd gain on the sorting intuition. And the latter seems a more important concept
to me (in C* that is), one that has more consequences for what you can/cannot do.

Now if I'm the only one who thinks that the PK notation may end up being more confusing
than helpful, and that it does not convey important notions specific to C*, then I'll shut up.
                
> CQL support for compound columns
> --------------------------------
>
>                 Key: CASSANDRA-2474
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Eric Evans
>            Assignee: Sylvain Lebresne
>              Labels: cql
>             Fix For: 1.1
>
>         Attachments: 2474-transposed-1.PNG, 2474-transposed-raw.PNG, 2474-transposed-select-no-sparse.PNG, 2474-transposed-select.PNG, raw_composite.txt, screenshot-1.jpg, screenshot-2.jpg
>
>
> For the most part, this boils down to supporting the specification of compound column
> names (the CQL syntax is colon-delimited terms), and then teaching the decoders (drivers) to
> create structures from the results.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
