incubator-cassandra-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: Dynamic Columns Question Cassandra 1.2.5, Datastax Java Driver 1.0
Date Thu, 06 Jun 2013 14:59:22 GMT
The problem with "being careful about how much you store in a collection"
is that Cassandra is a blind-write system. Knowing how much data is
currently in the collection before you write requires a read before
write, which is an anti-pattern.

Cassandra Rule 1: DON'T READ BEFORE WRITE
Cassandra Rule 2: ROWS CAN HAVE 2 BILLION COLUMNS
Collection Rule 1: DON'T STORE MORE THAN 100 THINGS IN A COLLECTION

Why is our user confused? It's simple.
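
A minimal CQL3 sketch of how these rules can coexist, assuming the `user`
table from the original question below. The `user_events` table and the
`tags`, `event_time`, `event_name` and `detail` columns are hypothetical
names chosen for illustration; they do not appear anywhere in the thread.
Open-ended data goes into clustering rows of a wide partition, and a
collection is kept only for a small, bounded set of values. Every write
shown is blind: none of them needs to know what is already stored.

```
-- CQL3 columns must be declared before use, so a new named column
-- is added with ALTER TABLE rather than by inserting an unknown name.
ALTER TABLE user ADD middlename text;

-- Small, bounded set of values: a collection is fine (hypothetical column).
ALTER TABLE user ADD tags set<text>;
UPDATE user SET tags = tags + {'admin'} WHERE firstname = 'joe';

-- Open-ended data: one partition per user, one clustering row per event
-- (hypothetical table). The partition can keep growing without a collection.
CREATE TABLE user_events (
    firstname  text,
    event_time timestamp,
    event_name text,
    detail     text,
    PRIMARY KEY (firstname, event_time, event_name)
);

-- Blind write: no read of the existing events is required.
INSERT INTO user_events (firstname, event_time, event_name, detail)
VALUES ('joe', '2013-06-06 14:59:22+0000', 'login', 'from web');

-- Read back only a slice of the partition instead of the whole thing.
SELECT event_time, event_name, detail
FROM user_events
WHERE firstname = 'joe' AND event_time >= '2013-06-01 00:00:00+0000';
```

The clustering columns play the role that keys like "event:someEvent:TS"
played as dynamic column names with the Thrift clients.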

On Thu, Jun 6, 2013 at 10:51 AM, Eric Stevens <mightye@gmail.com> wrote:

>> CQL3 does now support dynamic columns. For tags or metadata values you
>> could use a Collection:
>>
>
> This should probably be clarified.  A collection is a super useful tool,
> but it is *not* the same thing as a dynamic column.  It has many
> advantages, but there is one huge disadvantage in that you have to be
> careful how much data you store in a collection. When you read a single
> value out of a collection, the *entire* collection is always read, which
> of course is true for appending data to the collection as well.
>
> With a traditional dynamic column, you could have added things like event
> logs to a record in the form of keys named "event:someEvent:TS" (or
> juxtapose the order as your needs dictate).  You could basically do this
> practically indefinitely with little degradation in performance.  This was
> also a common way of representing cross-family relationships (one-to-many
> style).
>
> If you try to do the same thing with a collection, performance will
> degrade as your data grows.  For small or relatively static data sets (eg
> tags) that's fine.  For open-ended data sets (logs, events, one-to-many
> relationships that grow regularly), you should instead normalize such data
> into a separate column family.
>
> -Eric Stevens
> ProtectWise, Inc.
>
>
> On Thu, Jun 6, 2013 at 9:49 AM, Francisco Andrades Grassi <
> bigjocker@gmail.com> wrote:
>
>> Hi,
>>
>> CQL3 does now support dynamic columns. For tags or metadata values you
>> could use a Collection:
>>
>> http://www.datastax.com/dev/blog/cql3_collections
>>
>> For wide rows there's the enhanced primary keys, which I personally
>> prefer over the composite columns of yore:
>>
>> http://www.datastax.com/dev/blog/cql3-for-cassandra-experts
>> http://thelastpickle.com/2013/01/11/primary-keys-in-cql/
>>
>> --
>> Francisco Andrades Grassi
>> www.bigjocker.com
>> @bigjocker
>>
>> On Jun 6, 2013, at 8:32 AM, Joe Greenawalt <joe.greenawalt@gmail.com>
>> wrote:
>>
>> Hi,
>> I'm having some problems figuring out how to append a dynamic column to a
>> column family using the DataStax Java driver 1.0 and CQL3 on Cassandra
>> 1.2.5.  Below is what I'm trying:
>>
>> cqlsh:simplex> create table user (firstname text primary key, lastname text);
>> cqlsh:simplex> insert into user (firstname, lastname) values ('joe','shmoe');
>> cqlsh:simplex> select * from user;
>>
>>  firstname | lastname
>> -----------+----------
>>        joe |    shmoe
>>
>> cqlsh:simplex> insert into user (firstname, lastname, middlename) values ('joe','shmoe','lester');
>> Bad Request: Unknown identifier middlename
>> cqlsh:simplex> insert into user (firstname, lastname, middlename) values ('john','shmoe','lester');
>> Bad Request: Unknown identifier middlename
>>
>> I'm assuming you can do this, based on earlier Thrift-based clients
>> like pycassa, and also on reading this:
>>
>> The Cassandra data model is a dynamic schema, column-oriented data model.
>> This means that, unlike a relational database, you do not need to model all
>> of the columns required by your application up front, as each row is not
>> required to have the same set of columns. Columns and their metadata can be
>> added by your application as they are needed without incurring downtime to
>> your application.
>> here: http://www.datastax.com/docs/1.2/ddl/index
>>
>> Is it a limitation of CQL3 and its connection vs. Thrift?
>> Or, more likely, am I just doing something wrong?
>>
>> Thanks,
>> Joe
>>
>>
>>
>
