incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From aaron morton <aa...@thelastpickle.com>
Subject Re: CQL3 dynamic CF design questions
Date Tue, 09 Apr 2013 07:43:36 GMT
> Which would lead to the storage engine storing rows similar to this (AFAIK):
> 
> 8b2c0b60-977a-11e2-99c2-c8bcc8dc5d1d
> - basic_info:propertyset_val = { firstName:"john", lastName:"smith", ...}
> - contact_info:propertyset_val = { address: {streetAddr:"1 infinite loop", postalCode:
""}, ... }
> - meal_prefs:propertyset_val = { … }
If the internal layout ye wish to see, cast ye fate into the cassandra-cli 

cqlsh:dev> insert into user (id, propertyset_key, propertyset_val) values (8b2c0b60-977a-11e2-99c2-c8bcc8dc5d1d,
'monkey', 'purple');
cqlsh:dev> select * from user;

 id                                   | propertyset_key | propertyset_val
--------------------------------------+-----------------+-----------------
 8b2c0b60-977a-11e2-99c2-c8bcc8dc5d1d |          monkey |          purple

CLI...

[default@dev] get user[8b2c0b60-977a-11e2-99c2-c8bcc8dc5d1d];
=> (column=monkey:, value=, timestamp=1365482931142000)
=> (column=monkey:propertyset_val, value=707572706c65, timestamp=1365482931142000)
Returned 2 results.
Elapsed time: 32 msec(s).
[default@dev] 

The extra column is there to scare away the range ghosts that come and stay and stare at you
in the night. 


[default@dev] get user2[8b2c0b60-977a-11e2-99c2-c8bcc8dc5d1d];
=> (column=, value=, timestamp=1365492971141000)
=> (column=property_sets:6d6f6e6b6579, value=707572706c65, timestamp=1365492971141000)
Returned 2 results.

> so are these user table definitions equivalent from the storage engine perspective?

> but is it possible to accomplish that somehow with the first definition?

You can 

cqlsh:dev> select * from user where id = 8b2c0b60-977a-11e2-99c2-c8bcc8dc5d1d and propertyset_key
= 'monkey';

 id                                   | propertyset_key | propertyset_val
--------------------------------------+-----------------+-----------------
 8b2c0b60-977a-11e2-99c2-c8bcc8dc5d1d |          monkey |          purple

But if you try to do it without the primary key are asking for a range scan and will be scolded.



cqlsh:dev> select * from user where propertyset_key = 'monkey';
Bad Request: Cannot execute this query as it might involve data filtering and thus may have
unpredictable performance. If you want to execute this query despite the performance unpredictability,
use ALLOW FILTERING

> Or should I create separate column families for static and dynamic properties instead?
One downside to collection types is that they must be read in their entirety, I'd in this
case I would avoid them so you can read particular items. 

CQL 3 does not do well with mixed static / dynamic tables, if you can make a static CF and
select against it specifying the columns you want you'll get the best performance. 

Hope that helps. 
 

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 7/04/2013, at 8:11 PM, Marko Asplund <marko.asplund@gmail.com> wrote:

> Hi,
> 
> I'm currently designing a backend service that would store user profile information for
different applications. Most of the properties in a user profile would be unknown to the service
and specified by the applications using the service, so the properties would need to be added
dynamically.
> 
> I was planning to use CQL3 and a dynamic column family defined something like this:
> 
> CREATE TABLE user (
>   id UUID,
>   propertyset_key TEXT,
>   propertyset_val TEXT,
>   PRIMARY KEY (id, propertyset_key)
> );
> 
> There would be N (assuming < 50) property sets associated with a user.
> The property set values would be complex object graphs represented as JSON.
> 
> Which would lead to the storage engine storing rows similar to this (AFAIK):
> 
> 8b2c0b60-977a-11e2-99c2-c8bcc8dc5d1d
> - basic_info:propertyset_val = { firstName:"john", lastName:"smith", ...}
> - contact_info:propertyset_val = { address: {streetAddr:"1 infinite loop", postalCode:
""}, ... }
> - meal_prefs:propertyset_val = { ... }
> - ...
> 
> Any comments on this design?
> 
> Another option would be to use the Cassandra map type for storing property sets like
this:
> 
> CREATE TABLE user (
>   id UUID,
>   property_sets MAP<text, text>,
>   PRIMARY KEY (id)
> );
> 
> Based on the documentation I understood that each map element would internally be stored
as separate a column, so are these user table definitions equivalent from the storage engine
perspective?
> 
> I'm using Astyanax which seems to support Cassandra collections.
> 
> With the second definition, it should be possible to later migrate a dynamic property
e.g. job_title to a static property, so that I could execute CQL queries like this:
> 
> SELECT * FROM user WHERE job_title = 'developer';
> 
> but is it possible to accomplish that somehow with the first definition?
> Or should I create separate column families for static and dynamic properties instead?
> 
> 
> marko


Mime
View raw message