cassandra-commits mailing list archives

From "Sylvain Lebresne (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-2474) CQL support for compound columns
Date Tue, 20 Dec 2011 14:29:31 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-2474:
----------------------------------------

    Attachment: raw_composite.txt

bq. True, but none of the other proposals even come close to being as friendly as this one
for typical cases

Playing devil's advocate, I would say that 'sucking much less' doesn't necessarily make it 'the
right solution'.

Now, don't get me wrong, I like the TRANSPOSED idea for composite. But I think you made 2
proposals:
# a reasonably generic way to access CFs with a composite comparator in a CQL-ish way: the TRANSPOSED
part.
# an attempt at some special handling for the case of composites where the last component
takes only a known number of values: the SPARSE thing.

I do like the first part. Though I'd like to mention some remarks on the following comment:

bq. We're using TRANSPOSED AS similarly to how databases have used storage hints like CLUSTERED.
It doesn't affect the relational model of the data, but it gives you different performance
characteristics

While I understand what you mean, I don't think it's completely true: in the transposed
case, the order of definition matters, which has consequences for what you can do, both in
terms of writes and reads.
Consider the two definitions:
{noformat}
CREATE TABLE test1 (
    key text primary key,
    prop1 int,
    prop2 int,
    prop3 int
)
{noformat}
and
{noformat}
CREATE TABLE test2 (
    key int primary key,
    prop1 int,
    prop2 int,
    prop3 int
) TRANSPOSED AS (prop1, prop2)
{noformat}
Those two definitions don't only differ from a performance standpoint. Typically, you can
do
{noformat}
UPDATE test1 SET prop2 = 42 WHERE key = 'someKey';
{noformat}
but you cannot do the same query on test2. Btw, for test2, you don't necessarily have to specify
prop2 for every row, but you need at least prop1 and prop3 each time. My point being that
you do have to understand a bit of what is going on underneath to understand the limitations we
will have to put on this.

You also have the similar thing for gets: you can do
{noformat}
SELECT prop2 FROM test1 WHERE key = 'someKey';
{noformat}
but this makes no sense with test2 (or rather there is no way we can do this efficiently, i.e.,
without reading the row fully).
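
To make both limitations concrete, here is a minimal Python sketch of the two layouts. It is a toy model, not Cassandra code: it assumes test1 stores one column per property and test2 stores one column per (prop1, prop2) composite name with prop3 as the value, and the dicts and function names are hypothetical.

```python
# Toy model only: plain dicts stand in for Cassandra rows.
# Assumption: test1 keeps one column per property, while test2 keeps one
# column per (prop1, prop2) composite name, with prop3 as the column value.

def update_static(row, column, value):
    """test1: overwrite a single property without touching the others."""
    row[column] = value


def insert_transposed(row, prop1, prop2, prop3):
    """test2: prop1 and prop2 are part of the column name, prop3 is the value."""
    row[(prop1, prop2)] = prop3


def change_prop2_transposed(row, prop1, old_prop2, new_prop2):
    """test2: 'updating' prop2 means deleting one column and writing another."""
    row[(prop1, new_prop2)] = row.pop((prop1, old_prop2))


def select_prop2_transposed(row):
    """test2: prop2 lives in the column names, so every column must be read."""
    return [name[1] for name in sorted(row)]


test1_row = {"prop1": 1, "prop2": 2, "prop3": 3}
update_static(test1_row, "prop2", 42)

test2_row = {}
insert_transposed(test2_row, 1, 2, 3)
insert_transposed(test2_row, 1, 5, 7)
insert_transposed(test2_row, 4, 0, 9)
prop2_values = select_prop2_transposed(test2_row)

change_prop2_transposed(test2_row, 1, 2, 42)
```

Under these assumptions, the test1 update touches exactly one column, whereas in test2 there is no column named prop2 to overwrite: the old value has to be known, and reading prop2 requires walking the whole row.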

That being said, I'll reiterate that I'm reasonably convinced by this transposition notion, even
though I'd probably prefer to write it as
{noformat}
CREATE TRANSPOSED TABLE test2 (
    key int primary key,
    prop1 int,
    prop2 int,
    prop3 int
)
{noformat}
as was suggested in some comments above.

On the SPARSE thing, I am much less convinced that this is the right solution. I think that
having at the same 'level' variables that are just names to identify values in the resultset
(posted_at) and literals (posted_by) is confusing (and ugly). (As a side note, I don't "understand"
the choice of the SPARSE word).

Overall, I'm afraid we'll end up making some bad choices by trying to do too much at once. The
first problem we have is that CQL, which we'd like to push as the de facto way to access Cassandra,
doesn't allow access to composite columns at all. It seems to me that the transposed part alone
fixes that (again, except for the dynamic composite type). SPARSE doesn't add any new possibility,
it just adds a presumably better syntax for a specific case. I would be in favor of moving
this to a second step (which would be less urgent and would allow refocusing the discussion
on that very specific optimisation).

Lastly, and for the record, I would actually be in favor of having the first step on this
being the addition of a very simple 'raw' notation to access composites. Something that could
look like the example in the attached file 'raw_composite.txt' (put separately because this
comment is way too long already). The advantages being that: it's super simple to do, it'll
be natural for users coming from thrift and it'll have no specific limitations (in particular
it'll handle dynamic composites). Then, a second step would be to add more limited but more
user-friendly notations to deal with specific cases (like the transposed and the sparse thing).
                
> CQL support for compound columns
> --------------------------------
>
>                 Key: CASSANDRA-2474
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Eric Evans
>            Assignee: Pavel Yaskevich
>              Labels: cql
>             Fix For: 1.1
>
>         Attachments: 2474-transposed-1.PNG, 2474-transposed-raw.PNG, 2474-transposed-select.PNG,
raw_composite.txt, screenshot-1.jpg, screenshot-2.jpg
>
>
> For the most part, this boils down to supporting the specification of compound column
names (the CQL syntax is colon-delimited terms), and then teaching the decoders (drivers) to
create structures from the results.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
