incubator-cassandra-user mailing list archives

From Matthew Hillsborough <matthew.hillsboro...@gmail.com>
Subject Re: Using CQL to insert a column to a row dynamically
Date Mon, 27 May 2013 21:38:59 GMT
Thanks for the reply so far. If I can describe my data model, maybe it'll
be easier to see why I wanted to have "wide rows". Maybe there's a better
way to model this data using Cassandra or perhaps Cassandra isn't the right
tool.

Here's an example. Let's say I have a database (relational, for argument's
sake) that contains an `events` table, a `rides` table and a `users` table.
An event can have multiple occurrences. Imagine a carnival that's in town
for 5 days. At the carnival, there are many rides that "users" can ride. I
need to be able to answer, at scale, questions such as:

* On the second day of event N, how many `users` rode these `rides`?

The above can obviously be done in a relational db, but I was researching
whether Cassandra can do it better.

Originally what I thought of doing was creating a column family in
Cassandra named `ride_events`. Each row key would be a rideID that's simply
an integer. I would then arbitrarily create columns with a name of the
following format:

"EventID_5/Day_2/User_6" with a value of null.
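
A minimal sketch of the naming scheme above, in Python for illustration
only. The helper names `column_name` and `parse_column_name` are
hypothetical and not part of any Cassandra client; they only show how a
client could build and split such column names.

```python
def column_name(event_id: int, day: int, user_id: int) -> str:
    """Build a column name in the "EventID_5/Day_2/User_6" format."""
    return f"EventID_{event_id}/Day_{day}/User_{user_id}"


def parse_column_name(name: str) -> dict:
    """Split a column name back into its labelled integer parts."""
    parts = {}
    for segment in name.split("/"):
        # "EventID_5" -> label "EventID", number "5"
        label, _, number = segment.rpartition("_")
        parts[label] = int(number)
    return parts


name = column_name(event_id=5, day=2, user_id=6)
fields = parse_column_name(name)
```

Here `name` is the string quoted above, and `fields` maps "EventID", "Day"
and "User" to their integer values.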

I was under the impression that in Cassandra you can then run a query such
as the following (using CQL?):

SELECT "EventID_5/Day_2".."EventID_5/Day_2~" FROM ride_events WHERE key IN
([array of ride IDs]);
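
To make the intent of that range concrete, here is a small Python
simulation of a column slice over one row. It is only a sketch under
assumptions: column names are taken to be sorted as plain ASCII strings,
both ends of the slice are taken to be inclusive, and `slice_columns` and
the sample row are made up for illustration. It does not call Cassandra.

```python
from bisect import bisect_left, bisect_right


def slice_columns(sorted_names, start, end):
    """Return the names in [start, end], assuming sorted_names is sorted."""
    lo = bisect_left(sorted_names, start)
    hi = bisect_right(sorted_names, end)
    return sorted_names[lo:hi]


# Hypothetical columns of a single ride's row, kept in sorted order.
row = sorted([
    "EventID_5/Day_1/User_3",
    "EventID_5/Day_2/User_6",
    "EventID_5/Day_2/User_9",
    "EventID_5/Day_20/User_1",
    "EventID_5/Day_3/User_6",
    "EventID_6/Day_2/User_6",
])

# The range from the query above: "~" sorts after "/" and the digits.
loose = slice_columns(row, "EventID_5/Day_2", "EventID_5/Day_2~")

# Same idea, with the "/" separator included in both bounds.
strict = slice_columns(row, "EventID_5/Day_2/", "EventID_5/Day_2/~")

users_on_day_2 = len(strict)
```

Under these assumptions the first range also picks up the "Day_20" column,
because "Day_20" sorts between "Day_2" and "Day_2~". Including the trailing
"/" in both bounds restricts the slice to day 2 only.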

Ultimately my client will be responsible for splitting the data up and
making sense of it, but the retrieval/writing would be done in the
datastore and I'd be able to store many, many events for many rides and
grow it out horizontally.

Does that make sense at all? Is there a better way to model this data in
Cassandra and/or query for it?


On Mon, May 27, 2013 at 3:41 PM, Jonathan Ellis <jbellis@gmail.com> wrote:

> On Mon, May 27, 2013 at 9:28 AM, Matthew Hillsborough
> <matthew.hillsborough@gmail.com> wrote:
> > I am trying to understand some fundamentals in Cassandra, I was under the
> > impression that one of the advantages a developer can take in designing a
> > data model is by dynamically adding columns to a row identified by a key.
> > That means I can model my data so that if it makes sense, a key can be
> > something such as a user_id from a relational database, and I can for
> > example, create arbitrary amounts of columns that relate to that user.
>
> Fundamentally?  No.  Experience has shown that having schema to say
> "email column is text, and birth date column is a timestamp" is very
> useful as projects and teams grow.
>
> That said, if you really don't know what kinds of attributes might
> apply (generally because they are user-generated) you can use a Map.
>
> > Wouldn't this type of model make more sense to just stuff into a
> > relational database?
>
> There's nothing wrong with the relational model per se (subject to the
> usual explanation about needing to denormalize to scale).  Cassandra
> is about making applications scale, not throwing the SQL baby out with
> the bathwater for the sake of being different.
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder, http://www.datastax.com
> @spyced
>
