cassandra-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Cassandra Wiki] Update of "DataModel2" by MarkMcBride
Date Mon, 17 Aug 2009 22:21:28 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The following page has been changed by MarkMcBride:
http://wiki.apache.org/cassandra/DataModel2

------------------------------------------------------------------------------
  = Introduction =
  
- Cassandra has a data model that can most easily be thought of as a four or five dimensional
hash.  The basic concepts are a cluster, which can contain multiple keyspaces.  Each keyspace
can contain multiple column families.  Keyspaces contain multiple rows, which are referenced
by keys.  These rows contain multiple columns, each of which has a value and a timestamp.
 Super columns can be thought of as columns that have subcolumns.
+ Cassandra has a data model that can most easily be thought of as a four or five dimensional
hash.  The basic concepts are a cluster, which can contain multiple keyspaces.  Each keyspace
can contain multiple column families.  Keyspaces contain multiple rows, which are referenced
by keys.  These rows contain multiple columns, each of which has a value and a timestamp.
 Super columns can be thought of as columns that have subcolumns. We'll start from the bottom
up, moving from the leaves of Cassandra's data structure (columns) up to the root of the tree
(the cluster).
+ 
+ = Columns = 
+ 
+ The column is the lowest/smallest increment of data. It's a tuple (triplet) that contains
a name, a value and a timestamp.
+ 
+ Here's the Thrift interface definition of a Column:
+ 
+ struct Column {
+    1: binary                        name,
+    2: binary                        value,
+    3: i64                           timestamp,
+ }
+ 
+ And here's a column represented in JSON-ish notation:
+ 
+ {  // this is a column
+     name: "emailAddress",
+     value: "foo@bar.com",
+     timestamp: 123456789
+ }
+ 
+ 
+ All values are supplied by the client, including the timestamp.  This means that clocks
in the Cassandra environment should be synchronized, as these timestamps are used for conflict
resolution.  In many cases the timestamp is not used in client applications, and it becomes
convenient to think of a column as a name/value pair. For the remainder of this document,
timestamps will be elided for readability.  It is also worth noting the name and value are
binary values, although in many applications they are UTF8 serialized strings.
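The role of the client-supplied timestamp can be sketched in a few lines of Python. This is only an illustration of "newest timestamp wins" conflict resolution; the `Column` tuple mirrors the Thrift struct above, while the `resolve` helper and its tie-breaking rule are assumptions made for the example, not Cassandra's actual code.

```python
from typing import NamedTuple


class Column(NamedTuple):
    """Mirrors the Thrift struct: binary name, binary value, i64 timestamp."""
    name: bytes
    value: bytes
    timestamp: int  # supplied by the client, not by the server


def resolve(a: Column, b: Column) -> Column:
    """Pick between two versions of the same column: the newer timestamp wins.

    Hypothetical helper. Assumption: on a timestamp tie the first
    argument is kept; the source does not say how ties are handled.
    """
    if a.name != b.name:
        raise ValueError("can only resolve versions of the same column")
    return b if b.timestamp > a.timestamp else a


old = Column(b"emailAddress", b"foo@bar.com", 123456789)
new = Column(b"emailAddress", b"foo@example.org", 123456790)
winner = resolve(old, new)
```

If the two writing clients had clocks that disagreed, the "older" write could carry the larger timestamp and win, which is why the clocks should be synchronized.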
+ 
+ = Column Families =
+ A column family is a container for columns.  You define column families in your storage-conf.xml
file, and cannot modify them (or add new ones) without restarting your Cassandra
process.  A column family holds an ordered list of columns, which you can reference by the
column name.  A JSON representation would be:
+ 
+ { Users : {
+   emailAddress : {  // this is a column
+     name: "emailAddress",
+     value: "foo@bar.com"
+   },
+ 
+   webSite : {  // this is a column
+     name: "webSite",
+     value: "http://bar.com"
+   }
+ }}
+ 
+ Where "Users" is the column family, and "emailAddress" and "webSite" are columns.
+ 
+ = Rows =
+ 
+ A row-oriented database stores rows in a row-major fashion (i.e. all the columns in the
row are kept together). A column-oriented database on the other hand stores data on a per-column
basis. Column Families allow a hybrid approach. They allow you to break your row (the data
corresponding to a key) into a static number of groups a.k.a Column Families. In Cassandra,
each Column Family is stored in a separate file, and the file is sorted in row (i.e. key)
major order. Related columns, those that you'll access together, should ideally be kept within
the same column family for access efficiency. Column families have a configurable ordering
applied to rows, which affects behavior of the get_key_range call in the thrift API.  Out
of the box ordering implementations include ASCII, UTF-8, Long, and UUID (lexical or time).
+ 
+ A JSON representation of the row -> column family -> column structure is
+ 
+ { mccv : {Users : {
+       emailAddress : {name: "emailAddress", value: "foo@bar.com"},
+       webSite : {  name: "webSite", value: "http://bar.com"}},
+     Stats : {
+       visits : {name: "visits", value: "243"}
+     }
+   },
+   user2 : {Users : {
+     emailAddress : {name: "emailAddress", value: "user2@bar.com"},
+     twitter : {  name: "twitter", value: "user2"}}
+   }
+ }
+ 
+ Note that the row mccv identifies data in two different column families (Users and Stats).
This does not imply that data from these column families *must* be related.  The semantics
of having data for the same key in two different column families is entirely up to the application.
 Also note that within the Users column family, mccv and user2 have different column names
defined.  This is perfectly valid in Cassandra.  In fact there may be a virtually unlimited
set of column names defined, which leads to fairly common use of the column name as a piece
of runtime populated data.  This is unusual in storage systems, particularly if you're coming
from the RDBMS world.
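As a rough sketch, the row -> column family -> column structure above can be modelled as nested Python dictionaries. Timestamps are elided as in the text, and the helper names `get_column` and `column_names` are hypothetical, chosen only for this example; they are not part of the Thrift API.

```python
# key -> column family -> column name -> value (timestamps elided)
data = {
    "mccv": {
        "Users": {"emailAddress": "foo@bar.com", "webSite": "http://bar.com"},
        "Stats": {"visits": "243"},
    },
    "user2": {
        "Users": {"emailAddress": "user2@bar.com", "twitter": "user2"},
    },
}


def get_column(store, key, column_family, column):
    """Return the value at key/column family/column, or None if absent."""
    return store.get(key, {}).get(column_family, {}).get(column)


def column_names(store, key, column_family):
    """Column names present for one key in one column family, sorted by name."""
    return sorted(store.get(key, {}).get(column_family, {}))


visits = get_column(data, "mccv", "Stats", "visits")
missing = get_column(data, "user2", "Stats", "visits")  # user2 has no Stats data
```

Calling `column_names(data, "mccv", "Users")` and `column_names(data, "user2", "Users")` returns different lists, which is the point made above: rows in the same column family need not share column names.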
  
  = Keyspaces =
  
- A keyspace is the first dimension of the Cassandra hash, and is the container for column
families.  Almost all the Thrift API methods take a keyspace as the first argument, including
batch operations.
+ A keyspace is the first dimension of the Cassandra hash, and is the container for column
families. Keyspaces are roughly equivalent to a schema or database in the RDBMS world.  They
are the configuration and management point for column families, and are also the structure
on which batch inserts are applied.
  
- = Column Families and Columns =
+ = Super Columns =
  
- Basic unit of access control within Cassandra is a Column Family. A keyspace in Cassandra
is made up of one or many column families. A row in a keyspace is uniquely identified using
a unique key. The key is a string and can be of any size. The number of column families and
the name of each column family must currently be fixed at the time the cluster is started.
There is no limitation on the number of column families but it is expected that there would
be relatively few of these. A column family can be of one of two type: Simple or Super. Columns
within both of these are dynamically created and there is no limit on the number of these.
Columns are constructs that are uniquely identified by a name, a value and a user-defined
time stamp. The number of columns that can be contained in a column family could be very large.
This can also vary per key. For instance key K1 could have 1024 columns/supercolumns while
key K2 could have 64 columns/supercolumns. SuperColumns are constructs
that have a name and an infinite number of columns associated with them. The number of
supercolumns associated with any column family may be very large. They exhibit the same characteristics
as columns. The columns can be sorted by name or time and this can be explicitly expressed
via the configuration file, for any given column family.
+ So far we've covered "normal" column families.  Cassandra also supports super columns and
super column families.  A super column family is a column family whose members are super columns.
 A super column is just an associative array of columns.  Another way to think about this:
a super column is structurally very similar to a column family, so a super column family
is a column family that contains column families.  
  
- The main limitation on column and supercolumn size is that all data for a single key and
column must fit (on disk) on a single machine in the cluster.  Because keys alone are used
to determine the nodes responsible for replicating their data, the amount of data associated
with a single key has this upper bound.  This is an inherent limitation of the distribution
model.
+ A JSON description of this layout follows
  
- Currently Cassandra also has the limitation that in the worst case, data for a key / ColumnFamily
pair will all be deserialized into memory for a read request.  (But never for writes!)  This
will be fixed in a future release.
+ { mccv : {
+     Tags : {
+         cassandra : {
+             incubator : { incubator : "http://incubator.apache.org/cassandra/"},
+             jira : { jira : "http://issues.apache.org/jira/browse/CASSANDRA"}
+         },
+         thrift : {
+             jira : { jira : "http://issues.apache.org/jira/browse/THRIFT"}
+         }
+     }
+   }
+ }
  
- = Rows =
+ Here my super column family is "Tags".  I have two super columns defined here, "cassandra"
and "thrift".  Within these I have specific named bookmarks, each of which is a column.
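The same layout can be sketched as nested Python dictionaries with one extra level for the super column. This is only a model of the structure described above; `get_super_column` is a hypothetical helper, not a Thrift API call.

```python
# key -> super column family -> super column -> column name -> value
tags = {
    "mccv": {
        "Tags": {
            "cassandra": {
                "incubator": "http://incubator.apache.org/cassandra/",
                "jira": "http://issues.apache.org/jira/browse/CASSANDRA",
            },
            "thrift": {
                "jira": "http://issues.apache.org/jira/browse/THRIFT",
            },
        }
    }
}


def get_super_column(store, key, super_column_family, super_column):
    """Return a copy of all columns held by one super column."""
    return dict(store[key][super_column_family][super_column])


cassandra_bookmarks = get_super_column(tags, "mccv", "Tags", "cassandra")
```

Note that the column name "jira" appears under both super columns without conflict, because each super column is its own associative array.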
  
- A row-oriented database stores rows in a row-major fashion (i.e. all the columns in the
row are kept together). A column-oriented database on the other hand stores data on a per-column
basis. Column Families allow a hybrid approach. They allow you to break your row (the data
corresponding to a key) into a static number of groups a.k.a Column Families. In Cassandra,
each Column Family in a table is stored in a separate file, and the file is sorted in row
(i.e. key) major order. Related columns, those that you'll access together, should ideally
be kept within the same column family for access efficiency. Furthermore, columns in a column
family can be sorted and stored on disk either in time sorted order or in name sorted order.
SuperColumns, on the other hand, are always sorted by name. Columns within a super column
may be sorted by time.
+ == Example: SuperColumns for Search Apps ==
+ 
+ You can think of each supercolumn name as a term and the columns within as the docids with
rank info and other attributes being a part of it. If you have keys as the userids then you
can have a per-user index stored in this form. This is how the per user index for term search
is laid out for Inbox search at Facebook. Furthermore since one has the option of storing
data on disk sorted by "Time" it is very easy for the system to answer queries of the form
"Give me the top 10 messages". For a pictorial explanation please refer to the Cassandra powerpoint
slides presented at SIGMOD 2008.
+ 
  
  = Data Addressing =
  
+ The Thrift API introduces the notion of column paths and column parents.  These apply
to both super and normal column families.  Conceptually a column parent always refers
to a set of columns.  A column path always refers to a single column.  Thrift definitions
for these structures are:
- The Thrift API introduces the notion of column paths and column parents.
- Suppose we define a table called !MyTable with column families !MySuperColumnFamily (this
a column family of type Super) and !MyColumnFamily (this is simple column family). Any super
column, SC in the !MySuperColumnFamily is addressed with the  "!MySuperColumnFamily:SC" and
any column "C" within "SC" is addressed as !MySuperColumnFamily:SC:C. Any column C within
!MySimpleColumnFamily is addressed as "!MySimpleColumnFamily:C". In short ":" is reserved
word and should not be used as part of a Column Family name or as part of the name for a Super
Column or Column.  (We plan to address this limitation for the 0.4 release.)
  
+ struct ColumnParent {
+     3: string          column_family,
+     4: optional binary super_column,
+ }
+ 
+ struct ColumnPath {
+     3: string          column_family,
+     4: optional binary super_column,
+     5: optional binary column,
+ }
+ 
+ Suppose we define a table called !MyTable with column families !MySuperColumnFamily (this
is a column family of type Super) and !MySimpleColumnFamily (this is a simple column family). Any super
column SC in !MySuperColumnFamily is addressed as "!MySuperColumnFamily:SC", and
any column "C" within "SC" is addressed as 
+ 
+ new ColumnPath("!MySuperColumnFamily","SC","C")
+ 
+ Any column C within !MySimpleColumnFamily is addressed as 
+ 
+ new ColumnPath("!MySimpleColumnFamily",null,"C")
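To make the distinction concrete, here is a minimal Python sketch of how a column path and a column parent might be followed through one row. The two dataclasses copy the field names of the Thrift structs above; the `lookup` and `columns_under` functions and the dictionary-based row are assumptions for illustration, not the Thrift-generated client.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ColumnParent:
    column_family: str
    super_column: Optional[bytes] = None


@dataclass(frozen=True)
class ColumnPath:
    column_family: str
    super_column: Optional[bytes] = None
    column: Optional[bytes] = None


def columns_under(row, parent):
    """A column parent refers to a set of columns (hypothetical helper)."""
    node = row[parent.column_family]
    if parent.super_column is not None:
        node = node[parent.super_column]
    return node


def lookup(row, path):
    """A column path refers to a single column (hypothetical helper)."""
    parent = ColumnParent(path.column_family, path.super_column)
    return columns_under(row, parent)[path.column]


# One row, modelled as a dict of column families.
row = {
    "MySuperColumnFamily": {b"SC": {b"C": b"value in super column"}},
    "MySimpleColumnFamily": {b"C": b"value in simple column"},
}

in_super = lookup(row, ColumnPath("MySuperColumnFamily", b"SC", b"C"))
in_simple = lookup(row, ColumnPath("MySimpleColumnFamily", None, b"C"))
```

Leaving `super_column` unset is what lets the same structure address both kinds of column family.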
+ 
+ = Slice queries =
+ == Slice Predicates ==
+ == ColumnOrSuperColumn ==
  = Range queries =
  
  Cassandra supports pluggable partitioning schemes with a relatively small amount of code.
 Out of the box, Cassandra provides the hash-based RandomPartitioner and an OrderPreservingPartitioner.
 RandomPartitioner gives you pretty good load balancing with no further work required.  OrderPreservingPartitioner
on the other hand lets you perform range queries on the keys you have stored.  Systems that
only support hash-based partitioning cannot perform range queries efficiently.
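The difference between the two schemes can be sketched in Python. This is a toy model, not Cassandra's partitioner code: the MD5-based token function, the sample keys, and the `key_range` helper are all assumptions made for the example.

```python
import bisect
import hashlib

keys = ["alice", "bob", "carol", "dave", "erin"]


def md5_token(key: str) -> int:
    """Toy hash token (assumption: MD5 of the UTF-8 key, read as an integer)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


# Order-preserving placement: keys are laid out in their own sort order,
# so a key range is one contiguous slice found by binary search.
ordered = sorted(keys)


def key_range(sorted_keys, start, end):
    """Return all keys k with start <= k <= end from a sorted key list."""
    lo = bisect.bisect_left(sorted_keys, start)
    hi = bisect.bisect_right(sorted_keys, end)
    return sorted_keys[lo:hi]


# Hash-based placement: keys are laid out by token, which spreads them
# evenly but generally does not follow key order, so the same range
# query would have to examine every key.
hashed = sorted(keys, key=md5_token)

in_range = key_range(ordered, "bob", "dave")
```

With the ordered layout, neighbouring keys sit next to each other; with the hashed layout, neighbouring keys usually land in unrelated positions, which is what gives the load balancing and what makes range scans inefficient.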
  
- = Example: SuperColumns for Search Apps =
+ = Consistency Level =
  
- You can think of each supercolumn name as a term and the columns within as the docids with
rank info and other attributes being a part of it. If you have keys as the userids then you
can have a per-user index stored in this form. This is how the per user index for term search
is laid out for Inbox search at Facebook. Furthermore since one has the option of storing
data on disk sorted by "Time" it is very easy for the system to answer queries of the form
"Give me the top 10 messages". For a pictorial explanation please refer to the Cassandra powerpoint
slides presented at SIGMOD 2008.
+ = Batch Mutation =
  
