incubator-cassandra-user mailing list archives

From Jon Haddad <...@jonhaddad.com>
Subject Re: Wide rows/composite keys clarification needed
Date Mon, 21 Oct 2013 23:45:22 GMT
If you're working with CQL, you don't need to worry about the column names; they're handled
for you.

If you specify multiple columns as part of the primary key, the first is the partition key and
the rest become clustering keys, which are mapped into the column names.  So with a primary key
of (sensor_id, time_stamp), all the readings for a given sensor will be in the same row in the
traditional Cassandra sense, sorted by time_stamp.
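
To make that mapping concrete, here is a rough Python sketch of how CQL rows from the sensor_data table quoted below could be laid out as one wide row per sensor_id. It is only a simplified model, not Cassandra's actual storage code: the function name to_storage_rows and the sample readings are made up for illustration, and the (time_stamp, field) tuples stand in for the composite column names the engine builds.

```python
from collections import defaultdict

# Hypothetical model of the wide-row layout for a table with
# PRIMARY KEY (sensor_id, time_stamp). Illustration only; this is an
# assumption-laden sketch, not how Cassandra is implemented.
def to_storage_rows(cql_rows):
    """Group CQL rows by partition key (sensor_id). Each non-key field
    becomes one cell named by the composite (time_stamp, field_name)."""
    storage = defaultdict(dict)
    for row in cql_rows:
        for field in ("voltage", "amp"):
            storage[row["sensor_id"]][(row["time_stamp"], field)] = row[field]
    # Cells inside one storage row are kept sorted by composite column name.
    return {key: dict(sorted(cells.items())) for key, cells in storage.items()}

# Made-up sample readings, deliberately inserted out of time order.
readings = [
    {"sensor_id": 1, "time_stamp": 11, "voltage": 3.2, "amp": 0.4},
    {"sensor_id": 1, "time_stamp": 10, "voltage": 3.3, "amp": 0.5},
    {"sensor_id": 2, "time_stamp": 10, "voltage": 5.0, "amp": 1.0},
]
wide_rows = to_storage_rows(readings)

for sensor_id, cells in wide_rows.items():
    print(sensor_id, list(cells))
```

In this model both readings for sensor 1 land in the same storage row, ordered by time_stamp, and a second insert with the same sensor_id and time_stamp would overwrite the existing cells rather than append new ones, which is how CQL INSERT behaves as well.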

On Oct 21, 2013, at 4:27 PM, Les Hartzman <lhartzman@gmail.com> wrote:

> So looking at Patrick McFadin's data modeling videos I now know about using compound
> keys as a way of partitioning data on a by-day basis.
> 
> My other questions probably go more to the storage engine itself. How do you refer to
> the columns in the wide row? What kind of names are assigned to the columns?
> 
> Les
> 
> On Oct 20, 2013 9:34 PM, "Les Hartzman" <lhartzman@gmail.com> wrote:
> Please correct me if I'm not describing this correctly. But if I am collecting sensor
> data and have a table defined as follows:
> 
>          create table sensor_data (
>                sensor_id int,
>                time_stamp int,  // time to the hour granularity
>                voltage float,
>                amp float,
>                PRIMARY KEY (sensor_id, time_stamp));
> 
> The partitioning value is the sensor_id and the rest of the PK components become part
> of the column name for the additional fields, in this case voltage and amp.
> 
> What goes into determining what additional data is inserted into this row? The first
> time an insert takes place there will be one entry for all of the fields. Is there anything
> besides the sensor_id that is used to determine that the subsequent insertions for that sensor
> will go into the same row as opposed to starting a new row?
> 
> Based on something I read (but can't currently find again), I thought that as long as
> all of the elements of the PK remain the same (same sensor_id and still within the same hour
> as the first reading), the next insertion would be tacked onto the end of the first row.
> Is this correct?
> 
> For subsequent entries into the same row for additional voltage/amp readings, what are
> the names of the columns for these readings? My understanding is that the column name becomes
> a concatenation of the non-row-key field names plus the data field names. So if the first
> go-around you have <time_stamp>:<voltage> and <time_stamp>:<amp>, what do the
> subsequent column names become?
> 
> Thanks.
> 
> Les
> 

