Date: Thu, 9 Aug 2012 16:42:19 +0000 (UTC)
From: "Sylvain Lebresne (JIRA)"
To: commits@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-4377) CQL3 column value validation bug

    [ https://issues.apache.org/jira/browse/CASSANDRA-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431965#comment-13431965 ]

Sylvain Lebresne commented on CASSANDRA-4377:
---------------------------------------------

To be honest, I'm not sure I understand what a "named column" refers to above.
There are basically two pieces of information from CFMetadata that you need in order to insert a column correctly in a table (CQL3 or not): the comparator and *all* of column_metadata. The comparator is necessary to know what a valid column name is, and column_metadata is necessary to know what a valid column value is (I'm simplifying a bit: I'm assuming that key_validation and default_validator are BytesType, but that doesn't matter for the problem at hand).

Now the problem is that for any table created through CQL3 that doesn't use COMPACT STORAGE (let's call those CQL3 tables), all the ColumnDefinitions in column_metadata will have a componentIndex. So none of those ColumnDefinitions are exposed in thrift. In practice it means that if I do:
{noformat}
CREATE TABLE user (
    user_id blob PRIMARY KEY,
    name text,
    age int
)
{noformat}
then if a thrift client does a describe, it will basically get:
{noformat}
comparator = CompositeType(UTF8Type) // it's a composite so that we can add collections later on
column_metadata = []
{noformat}
At that point we have two slightly separate problems:
# Even if a user produces a valid column, say with a composite name of "age" and an int value, the code currently throws an exception. Fixing that exception is the goal of the attached patch (though it would have to be updated to work with collections in 1.2). I'm fine with fixing that, but I'm pointing out that there is a second, more general problem.
# Since the thrift client doesn't know about the actual column_metadata, how can we expect it to insert data correctly? In particular, I'm pretty sure higher level clients like pycassa or astyanax will serialize data incorrectly if they don't know the right value validator.

Besides, there are many ways to get confused if you use a CQL3 table from thrift. For instance, if you create the wrong column (getting the case wrong is enough), you'll be surprised not to be able to access it when you go back to CQL3.
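To illustrate the second problem, here is a minimal Python sketch of what a thrift-side client might end up writing when the describe call returns an empty column_metadata. It is not pycassa or astyanax code: the helper names (encode_composite, serialize_column) and the two validator tables are hypothetical, and the fallback to a text encoder is an assumption chosen to mirror the default_validation=text shown in the issue description. The composite name layout (2-byte length, component bytes, one end-of-component byte) follows the CompositeType encoding.

```python
import struct


def encode_composite(*components: bytes) -> bytes:
    """Encode a CompositeType column name.

    Each component is written as a 2-byte big-endian length, the component
    bytes, and a single end-of-component byte (0 here).
    """
    out = bytearray()
    for component in components:
        out += struct.pack(">H", len(component)) + component + b"\x00"
    return bytes(out)


def encode_int32(value) -> bytes:
    """Int32Type-style value: 4-byte big-endian signed integer."""
    return struct.pack(">i", value)


def encode_utf8(value) -> bytes:
    """UTF8Type-style value: the text form of the value as UTF-8 bytes."""
    return str(value).encode("utf-8")


# Hypothetical validator tables for the `user` table in the comment above.
# The first is what CQL3 knows; the second is what a thrift describe exposes
# (column_metadata = []).
CQL3_VALIDATORS = {"name": encode_utf8, "age": encode_int32}
THRIFT_VISIBLE_VALIDATORS = {}


def serialize_column(column, value, validators, default=encode_utf8):
    """Return (composite column name, value bytes) for one column.

    Assumption: a client with no per-column validator falls back to `default`.
    """
    encoder = validators.get(column, default)
    return encode_composite(column.encode("utf-8")), encoder(value)


name, good_value = serialize_column("age", 30, CQL3_VALIDATORS)
_, bad_value = serialize_column("age", 30, THRIFT_VISIBLE_VALIDATORS)

print(name)        # composite name for "age"
print(good_value)  # 4-byte int, as CQL3 expects
print(bad_value)   # text bytes, which CQL3 would not read back as an int

# Getting the case wrong produces a different column name altogether.
print(encode_composite(b"Age") == encode_composite(b"age"))
```

Under these assumptions the same logical write produces two different value encodings depending on whether the client can see the per-column validators, which is the kind of mismatch the comment is concerned about.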
To be clear, I am suggesting that we don't allow accessing tables created from CQL3 *without* COMPACT STORAGE from thrift, because I think it will be saner, even if it does mean that there is no coming back from CQL3 once you've really started using it.

> CQL3 column value validation bug
> --------------------------------
>
>                 Key: CASSANDRA-4377
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4377
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.1.1
>            Reporter: Nick Bailey
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.4
>
>         Attachments: 4377.txt
>
>
> {noformat}
> cqlsh> create keyspace test with strategy_class = 'SimpleStrategy' and strategy_options:replication_factor = 1;
> cqlsh> use test;
> cqlsh:test> CREATE TABLE stats (
>         ...   gid blob,
>         ...   period int,
>         ...   tid blob,
>         ...   sum int,
>         ...   uniques blob,
>         ...   PRIMARY KEY(gid, period, tid)
>         ... );
> cqlsh:test> describe columnfamily stats;
> CREATE TABLE stats (
>   gid blob PRIMARY KEY
> ) WITH
>   comment='' AND
>   comparator='CompositeType(org.apache.cassandra.db.marshal.Int32Type,org.apache.cassandra.db.marshal.BytesType,org.apache.cassandra.db.marshal.UTF8Type)' AND
>   read_repair_chance=0.100000 AND
>   gc_grace_seconds=864000 AND
>   default_validation=text AND
>   min_compaction_threshold=4 AND
>   max_compaction_threshold=32 AND
>   replicate_on_write='true' AND
>   compaction_strategy_class='SizeTieredCompactionStrategy' AND
>   compression_parameters:sstable_compression='SnappyCompressor';
> {noformat}
> You can see in the above output that the stats cf is created with the column validator set to text, but neither of the non primary key columns defined are text. It should either be setting metadata for those columns or not setting a default validator or some combination of the two.

--
This message is automatically generated by JIRA.