cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-4421) Support cql3 table definitions in Hadoop InputFormat
Date Fri, 24 May 2013 17:02:23 GMT


Jonathan Ellis commented on CASSANDRA-4421:

I think there are two sane alternatives for the reader.  We could expose {{RecordReader<List<ByteBuffer>,
List<ByteBuffer>>}} and assume the caller can figure out what his PK definition is,
and what columns he asked for and therefore what the List items correspond to.
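
To make the List option concrete, here is a minimal sketch of what the caller's side might look like for the test1 table from the issue description (PRIMARY KEY (a, b), selecting only c). The class name, the position constants and the helper methods are all made up for illustration; this is not the actual reader API.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

// Hypothetical caller-side view of a List-based record; not the actual reader API.
class ListRecordSketch {
    // The caller has to know these positions from its own schema and query:
    // PRIMARY KEY (a, b) and a selected column list of just "c".
    static final int KEY_A = 0;
    static final int KEY_B = 1;
    static final int VALUE_C = 0;

    // Encode an int as 4 big-endian bytes, ready for reading.
    static ByteBuffer int32(int v) {
        ByteBuffer bb = ByteBuffer.allocate(4);
        bb.putInt(v);
        bb.flip();
        return bb;
    }

    // Absolute read, so the buffer's position is left untouched.
    static int readInt(ByteBuffer bb) {
        return bb.getInt(bb.position());
    }

    // Nothing here checks that the positions match the schema; a wrong
    // index silently decodes the wrong column.
    static String describe(List<ByteBuffer> key, List<ByteBuffer> value) {
        return "a=" + readInt(key.get(KEY_A))
             + " b=" + readInt(key.get(KEY_B))
             + " c=" + readInt(value.get(VALUE_C));
    }

    public static void main(String[] args) {
        List<ByteBuffer> key = Arrays.asList(int32(1), int32(2));
        List<ByteBuffer> value = Arrays.asList(int32(300));
        System.out.println(describe(key, value)); // prints "a=1 b=2 c=300"
    }
}
```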

Alternatively we could expose {{RecordReader<Map<String, ByteBuffer>, Map<String,
ByteBuffer>>}}, which makes it a lot harder for the caller to screw things up, while
also making it more convenient.  (The only reason the original CFRR presents a Map as the
value is for convenience in referring to columns by name.)

(The Map should be a LinkedHashMap to preserve order as well.)
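
A rough sketch of the Map variant, again using the test1 table from the issue description. The class and method names are hypothetical. It only illustrates that a LinkedHashMap gives both lookup by column name and a stable iteration order (here, primary key order).

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical caller-side view of a Map-based record; not the actual reader API.
class MapRecordSketch {
    // Encode an int as 4 big-endian bytes, ready for reading.
    static ByteBuffer int32(int v) {
        ByteBuffer bb = ByteBuffer.allocate(4);
        bb.putInt(v);
        bb.flip();
        return bb;
    }

    // Build the key map in primary key order: a first, then b.
    static Map<String, ByteBuffer> keyOf(int a, int b) {
        Map<String, ByteBuffer> key = new LinkedHashMap<>();
        key.put("a", int32(a));
        key.put("b", int32(b));
        return key;
    }

    // LinkedHashMap iterates in insertion order, so this returns the
    // column names in the order they were put into the map.
    static List<String> columnOrder(Map<String, ByteBuffer> row) {
        return new ArrayList<>(row.keySet());
    }

    public static void main(String[] args) {
        Map<String, ByteBuffer> key = keyOf(1, 2);
        ByteBuffer b = key.get("b");                  // lookup by name, no index bookkeeping
        System.out.println("b=" + b.getInt(b.position()));
        System.out.println(columnOrder(key));
    }
}
```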

The best argument for sticking with the List is that we're basically forced to use a List
for the Writer's bind variables, since we don't support named parameters in CQL.  Which would
imply {{RecordWriter<List<ByteBuffer>, List<ByteBuffer>>}}.  The key
should be a List, since we can have compound PKs and we don't want to force people to turn
those into a single BB via CompositeType.  And the value should just be a single list of bind
variables because the list-of-lists is a holdover from the original CFRW. (Where TBH I don't
think it made sense either but we're kind of stuck with it now for backwards compatibility.)
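
For illustration only, a sketch of how positional bind variables could line up with a statement when both key and value are Lists. The UPDATE statement, the "values first, then PK components" ordering and all names below are assumptions made for this example; nothing here is decided or taken from the patch.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of positional binding; not the actual writer API.
class WriterBindSketch {
    // Assumed statement for the test1 table; '?' markers are positional.
    static final String UPDATE = "UPDATE test1 SET c = ? WHERE a = ? AND b = ?";

    // Encode an int as 4 big-endian bytes, ready for reading.
    static ByteBuffer int32(int v) {
        ByteBuffer bb = ByteBuffer.allocate(4);
        bb.putInt(v);
        bb.flip();
        return bb;
    }

    // Assumed ordering: the value's bind variables first, then the PK components.
    static List<ByteBuffer> bindVariables(List<ByteBuffer> key, List<ByteBuffer> value) {
        List<ByteBuffer> bound = new ArrayList<>(value);
        bound.addAll(key);
        return bound;
    }

    // Naive marker count; it would also count a '?' inside a string literal.
    static int countMarkers(String cql) {
        int n = 0;
        for (char ch : cql.toCharArray()) {
            if (ch == '?') n++;
        }
        return n;
    }

    public static void main(String[] args) {
        List<ByteBuffer> key = Arrays.asList(int32(1), int32(2));   // a, b
        List<ByteBuffer> value = Arrays.asList(int32(300));         // c
        List<ByteBuffer> bound = bindVariables(key, value);
        System.out.println(bound.size() == countMarkers(UPDATE));   // prints "true"
    }
}
```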

Or, we could do {{RecordWriter<Map<String, ByteBuffer>, List<ByteBuffer>>}},
for consistency with a Map-based Reader.


> Support cql3 table definitions in Hadoop InputFormat
> ----------------------------------------------------
>                 Key: CASSANDRA-4421
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: API
>    Affects Versions: 1.1.0
>         Environment: Debian Squeeze
>            Reporter: bert Passek
>              Labels: cql3
>             Fix For: 1.2.6
>         Attachments: 4421-1.txt, 4421-2.txt, 4421-3.txt, 4421-4.txt, 4421-5.txt, 4421.txt
> Hello,
> I ran into a bug when writing composite column values that are then validated on the server.
> This is the setup for reproduction:
> 1. create a keyspace
> create keyspace test with strategy_class = 'SimpleStrategy' and strategy_options:replication_factor = 1;
> 2. create a cf via cql (3.0)
> create table test1 (
>     a int,
>     b int,
>     c int,
>     primary key (a, b)
> );
> Looking at the schema in the cli, I noticed that there is no column metadata for
> columns that are not part of the primary key.
> create column family test1
>   with column_type = 'Standard'
>   and comparator = 'CompositeType(org.apache.cassandra.db.marshal.Int32Type,org.apache.cassandra.db.marshal.UTF8Type)'
>   and default_validation_class = 'UTF8Type'
>   and key_validation_class = 'Int32Type'
>   and read_repair_chance = 0.1
>   and dclocal_read_repair_chance = 0.0
>   and gc_grace = 864000
>   and min_compaction_threshold = 4
>   and max_compaction_threshold = 32
>   and replicate_on_write = true
>   and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
>   and caching = 'KEYS_ONLY'
>   and compression_options = {'sstable_compression' : ''};
> Please notice the default validation class: UTF8Type
> Now I would like to insert a value > 127 via the Cassandra client (no cql, part of mr-jobs).
> Have a look at the attachment.
> Batch mutate fails:
> InvalidRequestException(why:(String didn't validate.) [test][test1][1:c] failed validation)
> A validator for the column value is fetched in ThriftValidation::validateColumnData, which
> always returns the default validator, which is UTF8Type as described above. (The ColumnDefinition
> for the given column name "c" is always null.)
> In UTF8Type there is a check for
> if (b > 127)
>    return false;
> Anyway, maybe I'm doing something wrong, but I used cql 3.0 for table creation. I assigned
> data types to all columns, but I cannot set values for a composite column because the default
> validation class is used.
> I think the schema should know the correct validator even for composite columns. Using
> the default validation class does not make sense.
> Best Regards 
> Bert Passek
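
The failure described in the quoted report can be illustrated without a cluster. The sketch below serializes an int as 4 big-endian bytes and checks whether those bytes happen to be valid UTF-8. It uses the JDK's strict UTF-8 decoder as a stand-in and does not call Cassandra's UTF8Type validator; the class and method names are made up for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Stand-in for a UTF-8 validator applied to int-typed column values.
class Utf8ValidationSketch {
    // Encode an int as 4 big-endian bytes, ready for reading.
    static ByteBuffer int32(int v) {
        ByteBuffer bb = ByteBuffer.allocate(4);
        bb.putInt(v);
        bb.flip();
        return bb;
    }

    // Strict decode: any malformed byte sequence makes the value invalid.
    static boolean isValidUtf8(ByteBuffer bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(bytes.duplicate()); // duplicate() keeps the caller's position intact
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // 127 -> 00 00 00 7F: every byte is plain ASCII.
        System.out.println(isValidUtf8(int32(127))); // prints "true"
        // 128 -> 00 00 00 80: 0x80 is a continuation byte with no lead byte.
        System.out.println(isValidUtf8(int32(128))); // prints "false"
    }
}
```

Note that whether an int passes depends on its individual bytes, not on its magnitude: 300 serializes to 00 00 01 2C, which is valid UTF-8 by accident.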

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:
