cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-4377) CQL3 column value validation bug
Date Thu, 09 Aug 2012 16:42:19 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431965#comment-13431965 ]

Sylvain Lebresne commented on CASSANDRA-4377:
---------------------------------------------

To be honest, I'm not sure I understand what "named columns" means above.

There are basically two pieces of information from CFMetadata that you need in order to insert
a column correctly in a table (CQL3 or not): the comparator and *all* of column_metadata. The
comparator is necessary to know what a valid column name is, and the column_metadata is necessary
to know what a valid column value is (I'm simplifying a bit: I'm assuming that the key_validation
and default_validator are BytesType, but that doesn't matter for the problem at hand).
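
As an illustration of the two checks described above, here is a minimal Python sketch. It is a
toy model, not Cassandra's code: the TableMetadata class, the validator functions, and the
is_valid_column method are all hypothetical names, and the validators are deliberately simplified.
It shows that when column_metadata has no entry for a column, only the default validator is
applied to the value.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# A validator is modelled as a function from raw bytes to "is this acceptable?".
Validator = Callable[[bytes], bool]


def bytes_type(value: bytes) -> bool:
    """Toy stand-in for a BytesType validator: accepts anything."""
    return True


def utf8_type(value: bytes) -> bool:
    """Toy stand-in for a UTF8Type validator: accepts well-formed UTF-8 only."""
    try:
        value.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def int32_type(value: bytes) -> bool:
    """Toy stand-in for an Int32Type validator (assumed: empty or exactly 4 bytes)."""
    return len(value) in (0, 4)


@dataclass
class TableMetadata:
    """Hypothetical model of the two pieces of metadata discussed above."""
    comparator: Validator                       # decides what a valid column name is
    column_metadata: Dict[bytes, Validator] = field(default_factory=dict)
    default_validator: Validator = bytes_type   # used when a column has no metadata

    def is_valid_column(self, name: bytes, value: bytes) -> bool:
        if not self.comparator(name):
            return False
        value_validator = self.column_metadata.get(name, self.default_validator)
        return value_validator(value)


# Full metadata: the value of "age" is checked as an int.
full = TableMetadata(utf8_type, {b"name": utf8_type, b"age": int32_type})
# Empty column_metadata: every value falls through to the default validator.
blind = TableMetadata(utf8_type)

bad_age = b"thirty"
full_accepts_bad_age = full.is_valid_column(b"age", bad_age)
blind_accepts_bad_age = blind.is_valid_column(b"age", bad_age)
```

In this model, a holder of the full metadata rejects the malformed value, while a holder of an
empty column_metadata has no way to notice that it is wrong.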

Now the problem is that for any table created through CQL3 that doesn't use COMPACT STORAGE
(let's call those CQL3 tables), every ColumnDefinition in column_metadata will have a componentIndex,
so none of those ColumnDefinitions are exposed in thrift. In practice it means that if I do:
{noformat}
CREATE TABLE user (
    user_id blob PRIMARY KEY,
    name text,
    age int
)
{noformat}
then if a thrift client does a describe, it will basically get:
{noformat}
comparator = CompositeType(UTF8Type) // it's a composite so that we can add collections later on
column_metadata = []
{noformat}

At that point we have two slightly separate problems:
# Even if a user produces a valid column, say with a composite name of "age" and an int
value, the code currently throws an exception. Fixing that exception is the goal
of the attached patch (though it would have to be updated to work with collections in 1.2).
I'm fine with fixing that, but I'm pointing out that there is a second, more general problem.
# Since the thrift client doesn't know the actual column_metadata, how can we expect
it to insert data correctly? In particular, I'm pretty sure higher-level clients like pycassa
or astyanax will serialize data incorrectly if they don't know the right value validator.
Besides, there are many ways to get confused if you use a CQL3 table from thrift. For instance,
if you create the wrong column (it's enough to mess up the case), you'll be surprised not to
be able to access it when you go back to CQL3. So to be clear, I am suggesting that we don't
allow thrift access to tables created from CQL3 *without* COMPACT STORAGE, because I think
that is saner, even if it does mean that there is no coming back from CQL3 once you've
really started using it.
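
The serialization concern in the second point can be sketched in Python. This is an illustration
under stated assumptions, not pycassa or astyanax code: composite_name, encode_int32 and
naive_encode are hypothetical helpers, and the composite layout used here (a 2-byte big-endian
length, the component bytes, then a one-byte end-of-component marker) is an assumption about how
CompositeType encodes names, as is the 4-byte big-endian encoding for int values.

```python
import struct

END_OF_COMPONENT = b"\x00"  # assumed end-of-component marker for a plain component


def composite_name(*components: bytes) -> bytes:
    """Encode components as <2-byte length><bytes><end-of-component> (assumed layout)."""
    out = bytearray()
    for component in components:
        out += struct.pack(">H", len(component)) + component + END_OF_COMPONENT
    return bytes(out)


def encode_int32(n: int) -> bytes:
    """What a client would send if it knew the column value is an int."""
    return struct.pack(">i", n)


def naive_encode(value) -> bytes:
    """What a client might send if it only sees a bytes/text default validator."""
    return str(value).encode("utf-8")


# With the right metadata, the value for "age" is 4 bytes.
informed_value = encode_int32(30)
# Without column_metadata, a client may fall back to a text-like encoding.
uninformed_value = naive_encode(30)

# Column names are compared as bytes, so getting the case wrong creates a different column.
right_name = composite_name(b"age")
wrong_name = composite_name(b"Age")
```

Here informed_value and uninformed_value differ in both length and content, and right_name and
wrong_name are distinct column names, which is the kind of mismatch that would make the data
unreadable or invisible when going back to CQL3.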

                
> CQL3 column value validation bug
> --------------------------------
>
>                 Key: CASSANDRA-4377
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4377
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.1.1
>            Reporter: Nick Bailey
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.4
>
>         Attachments: 4377.txt
>
>
> {noformat}
> cqlsh> create keyspace test with strategy_class = 'SimpleStrategy' and strategy_options:replication_factor = 1;
> cqlsh> use test;
> cqlsh:test> CREATE TABLE stats (
>         ...   gid          blob,
>         ...   period     int,
>         ...   tid          blob, 
>         ...   sum        int,
>         ...   uniques           blob,
>         ...   PRIMARY KEY(gid, period, tid)
>         ... );
> cqlsh:test> describe columnfamily stats;
> CREATE TABLE stats (
>   gid blob PRIMARY KEY
> ) WITH
>   comment='' AND
>   comparator='CompositeType(org.apache.cassandra.db.marshal.Int32Type,org.apache.cassandra.db.marshal.BytesType,org.apache.cassandra.db.marshal.UTF8Type)' AND
>   read_repair_chance=0.100000 AND
>   gc_grace_seconds=864000 AND
>   default_validation=text AND
>   min_compaction_threshold=4 AND
>   max_compaction_threshold=32 AND
>   replicate_on_write='true' AND
>   compaction_strategy_class='SizeTieredCompactionStrategy' AND
>   compression_parameters:sstable_compression='SnappyCompressor';
> {noformat}
> You can see in the above output that the stats cf is created with the column validator
> set to text, but neither of the non primary key columns defined are text. It should either
> be setting metadata for those columns or not setting a default validator or some combination
> of the two.


        
