incubator-cassandra-user mailing list archives

From "Kevin Burton" <rkevinbur...@charter.net>
Subject RE: CF metadata syntax for an array
Date Thu, 15 Nov 2012 03:18:52 GMT
In the example below I am assuming that id is the order id. Would there be
considerable duplication if other column families/tables are also keyed by
id? It seems that id could potentially be duplicated for each column
family/table. Is that just the way it is? For a small data set this would be
no big deal, but for millions or billions of orders it becomes significant.

 

From: aaron morton [mailto:aaron@thelastpickle.com] 
Sent: Wednesday, November 14, 2012 6:00 PM
To: user@cassandra.apache.org
Subject: Re: CF metadata syntax for an array

 

By expressing the KEY as two values this basically gives the
database a hint that this will be an array?

Things are going to be easier if you stop thinking about arrays :)

 

For background
http://www.datastax.com/docs/1.1/dml/using_cql#using-composite-primary-keys

 

The first part of the primary key is the storage engine row key. This is the
thing that Cassandra uses to identify the row in the cluster. You could think
of this as similar to a Hive partition. But if you have not used Hive I
would not think of it like that.

 

The remaining parts are used to prefix the columns in the row. Internally,
the columns created have the names

 

* value of item_id : "price"

* value of item_id : "title" 
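
The naming scheme above can be sketched in a few lines of Python. This is only a toy model of the layout described in this message, not Cassandra code; the function name to_internal_columns and the dict-based "storage" are assumptions made for illustration.

```python
# Hypothetical model of the storage layout described above; not Cassandra code.
from collections import defaultdict


def to_internal_columns(rows, partition_key, clustering_key):
    """Group CQL-style rows into {row_key: {"<clustering value>:<column>": value}}."""
    storage = defaultdict(dict)
    for row in rows:
        row_key = row[partition_key]   # first part of the primary key
        prefix = row[clustering_key]   # remaining part, used as a column prefix
        for name, value in row.items():
            if name in (partition_key, clustering_key):
                continue  # in this model the key parts are not stored as cells
            storage[row_key][f"{prefix}:{name}"] = value
    return dict(storage)


orders = [
    {"id": 1, "item_id": 1, "price": 4, "title": "this is an example"},
    {"id": 1, "item_id": 2, "price": 8, "title": "this is another example"},
    {"id": 1, "item_id": 3, "price": 6, "title": "this is the last example"},
]

layout = to_internal_columns(orders, "id", "item_id")
for column_name in layout[1]:
    print(column_name)
```

In this model the three line items of order 1 end up as six columns under a single row key, which is why one order with many items is one wide row rather than many small ones.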

 

Hope that helps. 

 

-----------------

Aaron Morton

Freelance Cassandra Developer

New Zealand

 

@aaronmorton

http://www.thelastpickle.com

 

On 15/11/2012, at 11:35 AM, Peter Lin <woolfel@gmail.com> wrote:





it means the column family uses a composite key, which gives you
additional capabilities like ORDER BY in SELECT queries

On Wed, Nov 14, 2012 at 5:27 PM, Kevin Burton <rkevinburton@charter.net>
wrote:



I hope I am not bugging you, but what is the purpose of PRIMARY KEY(id,
item_id)? By expressing the KEY as two values this basically gives the
database a hint that this will be an array? Is there an implicit INDEX on id
and item_id? Thanks again.

-----Original Message-----
From: aaron morton [mailto:aaron@thelastpickle.com]
Sent: Wednesday, November 14, 2012 4:08 PM
To: user@cassandra.apache.org
Subject: Re: CF metadata syntax for an array

Like this?

cqlsh:dev> CREATE TABLE my_orders(
      ...     id int,
      ...     item_id int,
      ...     price decimal,
      ...     title text,
      ...     PRIMARY KEY(id, item_id)
      ... );

cqlsh:dev> insert into my_orders
      ...     (id, item_id, price, title)
      ... values
      ...     (1, 1, 4, 'this is an example');
cqlsh:dev> insert into my_orders
      ...     (id, item_id, price, title)
      ... values
      ...     (1, 2, 8, 'this is another example');
cqlsh:dev> insert into my_orders
      ...     (id, item_id, price, title)
      ... values
      ...     (1, 3, 6, 'this is the last example');

cqlsh:dev> select * from my_orders where id = 1;
 id | item_id | price | title
----+---------+-------+--------------------------
  1 |       1 |     4 |       this is an example
  1 |       2 |     8 |  this is another example
  1 |       3 |     6 | this is the last example
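
For readers who find it easier to follow in code, here is a rough Python analogy of the session above. It is a sketch under the assumption that rows sharing an id live together and are read back sorted by item_id; the helper names insert and select_by_id are made up for this example.

```python
# Toy in-memory table keyed like PRIMARY KEY (id, item_id); names are illustrative.
my_orders = {}  # {id: {item_id: {"price": ..., "title": ...}}}


def insert(order_id, item_id, price, title):
    """Store one line item under its order id and item id."""
    my_orders.setdefault(order_id, {})[item_id] = {"price": price, "title": title}


def select_by_id(order_id, descending=False):
    """Return all line items of one order, sorted by item_id."""
    partition = my_orders.get(order_id, {})
    ordered = sorted(partition.items(), key=lambda kv: kv[0], reverse=descending)
    return [
        (order_id, item_id, cells["price"], cells["title"])
        for item_id, cells in ordered
    ]


# Insert out of order on purpose; reads still come back sorted by item_id.
insert(1, 2, 8, "this is another example")
insert(1, 1, 4, "this is an example")
insert(1, 3, 6, "this is the last example")

for row in select_by_id(1):
    print(row)
```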




-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 15/11/2012, at 10:38 AM, Kevin Burton <rkevinburton@charter.net> wrote:




An array would be a list of groups of items. In my case I want a
list/array of line items. An order has certain characteristics and one of
them is a list of the items that are being ordered. Say every line item has
an id, price, and description, so one such "array" would look like:




1 $4.00 "This is an example"
2 $8.00 "This is another example"
3 $6.00 "This is the last example"

So in this case the array would be the three items listed above.  So the
"column" is repeated three times for this order.




From: aaron morton [mailto:aaron@thelastpickle.com]
Sent: Wednesday, November 14, 2012 3:05 PM
To: user@cassandra.apache.org
Subject: Re: CF metadata syntax for an array

In both cases the array is the PRIMARY_KEY.
I'm not sure what you mean by the "array"

The vector_name and list_name columns are used as "variable names" to
identify a particular vector or list. They are the storage engine "row key".




Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 14/11/2012, at 5:31 PM, Kevin Burton <rkevinburton@charter.net> wrote:


Does the  array have to be a KEY?
Sorry I don't understand this question.

In the samples you give you specify array as

    CREATE COLUMNFAMILY Description (
        PRIMARY KEY (vector_name, index),
        Age text,
        Gender text,
        vector_name text,
        index bigint,
        ...

Or
    CREATE COLUMNFAMILY Description (
        PRIMARY KEY (list_name, sort_key),
        Age text,
        Gender text,
        sort_key bigint,
        list_name text
        ..


In both cases the array is the PRIMARY_KEY. In order for an array to work
does the array have to be a KEY?




From: aaron morton [mailto:aaron@thelastpickle.com]
Sent: Tuesday, November 13, 2012 10:18 PM
To: user@cassandra.apache.org
Subject: Re: CF metadata syntax for an array


Would this syntax be the same for CREATE COLUMNFAMILY (as an aside what is
a 'TABLE' in Cassandra)?



Yes, CQL 2 uses COLUMNFAMILY or TABLE and CQL 3 uses TABLE

http://www.datastax.com/dev/blog/cql3-evolutions

In other words is this valid:
Does it work? If so, it's valid.

Does the  array have to be a KEY?
Sorry I don't understand this question.

Finally, what would be the syntax for inserting data into the CF?
Depends on what you want to do.
Docs are a good starting point
http://www.datastax.com/docs/1.1/references/cql/index

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/11/2012, at 2:42 AM, Kevin Burton <rkevinburton@charter.net> wrote:



Sorry to be so slow but I am just learning CQL. Would this syntax be the
same for CREATE COLUMNFAMILY (as an aside what is a 'TABLE' in Cassandra)?
In other words is this valid:




    CREATE COLUMNFAMILY Description (
        PRIMARY KEY (vector_name, index),
        Age text,
        Gender text,
        vector_name text,
        index bigint,
        ...

Or
    CREATE COLUMNFAMILY Description (
        PRIMARY KEY (list_name, sort_key),
        Age text,
        Gender text,
        sort_key bigint,
        list_name text
        ..



Does the array have to be a KEY? Finally, what would be the syntax for
inserting data into the CF?




Thanks again.

From: aaron morton [mailto:aaron@thelastpickle.com]
Sent: Tuesday, November 13, 2012 4:09 AM
To: user@cassandra.apache.org
Subject: Re: CF metadata syntax for an array

While this solves the problem for an array of 'primitive' types. What if I
want an array or collection of an arbitrary type like list<foo>, where foo
is a user defined type?



Do you mean a custom Cassandra data type that subclasses AbstractType? I
don't think CQL can support those, I may be wrong though.




If you mean a basic client side data type you could serialise it and store
it as a string or byte buffer in a CQL collection.
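
As a rough illustration of the client-side approach, the sketch below turns a user-defined object into a string and back. The Foo class and the choice of JSON are assumptions for the example; any serialisation format the client can read back would do, and the actual write to a CQL collection is left out.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Foo:
    """Hypothetical user-defined type; the fields are made up for this example."""
    name: str
    weight: float


def serialise(item: Foo) -> str:
    """Encode one Foo as a JSON string suitable for a text column or collection."""
    return json.dumps(asdict(item), sort_keys=True)


def deserialise(text: str) -> Foo:
    """Rebuild a Foo from the JSON string produced by serialise()."""
    return Foo(**json.loads(text))


foos = [Foo("widget", 1.5), Foo("gadget", 0.25)]
as_text = [serialise(f) for f in foos]        # values a client could store as text
restored = [deserialise(t) for t in as_text]  # what the client gets back on read
print(as_text[0])
```

The database only ever sees opaque strings here, so it cannot query or index the fields inside them; that trade-off is the price of keeping the type on the client side.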




What are the options to solve this type of array?
...



arbitrary type like list<foo>,
Do you mean an array such as int[] or do you mean the equivalent of a Java
List<T> with functions like remove that actually delete items from the
list?




If it's the former a CQL table such as below would work

CREATE TABLE vectors (
           vector_name text,
           index bigint,
           object_property_1 text,
           object_property_2 text,
           PRIMARY KEY (vector_name, index) );

The problem is, if you delete an element at (vector, index) the remaining
indexes will be off.
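
To make the index problem concrete, here is a small Python sketch that mimics a table keyed by (vector_name, index) with a plain dict. The helper names are invented for the example and nothing here talks to Cassandra.

```python
# Toy stand-in for a table keyed by (vector_name, index); illustrative only.
vectors = {}  # {(vector_name, index): value}


def set_item(vector_name, index, value):
    vectors[(vector_name, index)] = value


def delete_item(vector_name, index):
    # Removes exactly one key; nothing shifts the later indexes down.
    vectors.pop((vector_name, index), None)


def indexes(vector_name):
    return sorted(i for (name, i) in vectors if name == vector_name)


for i, value in enumerate(["a", "b", "c", "d"]):
    set_item("v1", i, value)

delete_item("v1", 1)
print(indexes("v1"))  # the gap at index 1 stays: 0, 2, 3
```

An in-memory array would move "c" down to index 1 after the delete. Here "c" stays at index 2, so the client has to either tolerate gaps or rewrite the later elements itself.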




If it's the latter, a List<T>, then it depends a little on what
features you want to support. If you want a sorted list of objects the
table is roughly the same

CREATE TABLE List (
           list_name text,
           sort_key bigint,
           object_property_1 text,
           object_property_2 text,
           PRIMARY KEY (list_name, sort_key) );
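
A minimal Python sketch of the sorted-list idea follows. It assumes the client chooses the sort_key values itself and leaves gaps between them (10, 20, 30 here) so that items can be added in between later; that spacing is a convention of this example, not something the table requires.

```python
# Toy model of a table keyed by (list_name, sort_key); names are illustrative.
lists = {}  # {list_name: {sort_key: item}}


def add(list_name, sort_key, item):
    lists.setdefault(list_name, {})[sort_key] = item


def remove(list_name, sort_key):
    lists.get(list_name, {}).pop(sort_key, None)


def read(list_name):
    """Return the items of one list ordered by sort_key."""
    entries = lists.get(list_name, {})
    return [item for _, item in sorted(entries.items(), key=lambda kv: kv[0])]


add("todo", 10, "first")
add("todo", 30, "third")
add("todo", 20, "second")

remove("todo", 20)          # no renumbering needed, order comes from sort_key
add("todo", 15, "inserted") # fits between 10 and 30 thanks to the gaps
print(read("todo"))
```

Because position is derived from the ordering of sort_key rather than stored as a dense index, removing an item does not leave the rest of the list "off".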

Hope that helps.

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com


On 13/11/2012, at 9:46 AM, Kevin Burton <rkevinburton@charter.net> wrote:




While this solves the problem for an array of 'primitive' types. What if I
want an array or collection of an arbitrary type like list<foo>, where foo
is a user defined type? I am guessing that this cannot be done with
'collections'. What are the options to solve this type of array?




On Nov 12, 2012, at 2:28 PM, aaron morton <aaron@thelastpickle.com> wrote:

This may help http://www.datastax.com/dev/blog/cql3_collections

I have gotten as far as feeling a need to understand a 'super-column'
You can happily ignore them.


Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 12/11/2012, at 8:35 PM, Kevin Burton <rkevinburton@charter.net> wrote:




I am sorry if this is an FAQ. But I was wondering what the syntax is for
describing an array? I have gotten as far as feeling a need to understand a
'super-column' but I fail after that. Once I have the metadata in place to
describe an array how do I insert data into the array? Get data from the
array? Thank you.



 

 

 

