incubator-cassandra-user mailing list archives

From Alexander Shutyaev <shuty...@gmail.com>
Subject Re: mysterious 'column1' in cql describe
Date Fri, 30 Aug 2013 10:31:04 GMT
Thanks, Sylvain! I'll read them thoroughly, but after a quick glance I'd
like to repeat my other (implied) question, which I believe these articles
will not answer.

Why does explicitly defining the columns of a column family
significantly improve performance and the key cache hit ratio (the latter
being almost zero when there are no explicit column definitions)?


2013/8/30 Sylvain Lebresne <sylvain@datastax.com>

> The short story is that you're probably not up to date on how CQL and
> thrift table definitions relate to one another, and the relationship may not
> be exactly what you think it is. If you haven't done so, I'd suggest reading
> http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows
> (should answer your "what about dynamic column name" case) and
> http://www.datastax.com/dev/blog/thrift-to-cql3 (should help explain how
> CQL3 interprets thrift tables, and why you saw what you saw).
>
> --
> Sylvain
>
>
> On Fri, Aug 30, 2013 at 9:50 AM, Alexander Shutyaev <shutyaev@gmail.com> wrote:
>
>> Hi all!
>>
>> We have encountered the following problem. We create our column families
>> via hector like this:
>>
>> ColumnFamilyDefinition cfdef = HFactory.createColumnFamilyDefinition("mykeyspace", "mycf");
>> cfdef.setColumnType(ColumnType.STANDARD);
>> cfdef.setComparatorType(ComparatorType.UTF8TYPE);
>> cfdef.setDefaultValidationClass("BytesType");
>> cfdef.setKeyValidationClass("UTF8Type");
>> cfdef.setReadRepairChance(0.1);
>> cfdef.setGcGraceSeconds(864000);
>> cfdef.setMinCompactionThreshold(4);
>> cfdef.setMaxCompactionThreshold(32);
>> cfdef.setReplicateOnWrite(true);
>> cfdef.setCompactionStrategy("SizeTieredCompactionStrategy");
>> Map<String, String> compressionOptions = new HashMap<String, String>();
>> compressionOptions.put("sstable_compression", "");
>> cfdef.setCompressionOptions(compressionOptions);
>> cluster.addColumnFamily(cfdef, true);
>>
>> When we describe this column family via cqlsh, we get this:
>>
>> CREATE TABLE "mycf" (
>>   key text,
>>   column1 text,
>>   value blob,
>>   PRIMARY KEY (key, column1)
>> ) WITH COMPACT STORAGE AND
>>   bloom_filter_fp_chance=0.010000 AND
>>   caching='KEYS_ONLY' AND
>>   comment='' AND
>>   dclocal_read_repair_chance=0.000000 AND
>>   gc_grace_seconds=864000 AND
>>   read_repair_chance=0.100000 AND
>>   replicate_on_write='true' AND
>>   populate_io_cache_on_flush='false' AND
>>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>>   compression={};
>>
>> As you can see, there is a mysterious *column1*, and moreover it is part
>> of the primary key. We thought this was wrong, so we tried to get rid of
>> it. We managed to do so by adding explicit column definitions like this:
>>
>> BasicColumnDefinition cdef = new BasicColumnDefinition();
>> cdef.setName(StringSerializer.get().toByteBuffer("mycolumn"));
>> cdef.setValidationClass(ComparatorType.BYTESTYPE.getTypeName());
>> cdef.setIndexType(ColumnIndexType.CUSTOM);
>> cfdef.addColumnDefinition(cdef);
>>
>> After this, the primary key became:
>>
>> PRIMARY KEY (key)
>>
>> The effect of this was *overwhelming* - we got a tremendous performance
>> improvement and according to stats, the key cache began working while
>> previously its hit ratio was close to zero.
>>
>> My questions are:
>>
>> 1) What is this all about? Is what we did right?
>> 2) In this project we can provide explicit column definitions. But in
>> another project we have some column families where this is not possible
>> because column names are dynamic (based on timestamps). If what we did is
>> right - how can we adapt this solution to the dynamic column name case?
>>
>
>
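
The two outcomes reported in this thread can be summarized as a toy model. What follows is a minimal sketch in plain Java, not Cassandra or Hector code: the class and method names (ThriftToCqlSketch, cqlColumns, primaryKey) are hypothetical, and it assumes a simple (non-composite) comparator as in the example above. It only reproduces what cqlsh showed in the two cases described: without declared columns the table appears as (key, column1, value) with column1 in the primary key, and with declared columns it appears as key plus those columns with key alone as the primary key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy model (hypothetical, for illustration only) of how cqlsh rendered the column family. */
public class ThriftToCqlSketch {

    /**
     * Columns shown by cqlsh for a thrift column family with a simple comparator,
     * given the column names declared in its column metadata.
     */
    static List<String> cqlColumns(List<String> declaredColumns) {
        List<String> columns = new ArrayList<String>();
        columns.add("key");
        if (declaredColumns.isEmpty()) {
            // No column metadata: the thrift column name surfaces as "column1"
            // and the thrift column value as "value".
            columns.add("column1");
            columns.add("value");
        } else {
            // Declared columns appear under their own names.
            columns.addAll(declaredColumns);
        }
        return columns;
    }

    /** Primary key clause shown by cqlsh in each of the two cases. */
    static String primaryKey(List<String> declaredColumns) {
        return declaredColumns.isEmpty()
                ? "PRIMARY KEY (key, column1)"
                : "PRIMARY KEY (key)";
    }

    public static void main(String[] args) {
        List<String> none = Collections.<String>emptyList();
        List<String> one = Arrays.asList("mycolumn");
        // prints: [key, column1, value] PRIMARY KEY (key, column1)
        System.out.println(cqlColumns(none) + " " + primaryKey(none));
        // prints: [key, mycolumn] PRIMARY KEY (key)
        System.out.println(cqlColumns(one) + " " + primaryKey(one));
    }
}
```

The model says nothing about performance or the key cache; it only restates the schema shapes seen in the describe output.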
