incubator-cassandra-user mailing list archives

From Paulo Motta <pauloricard...@gmail.com>
Subject Re: Complex JSON objects
Date Wed, 11 Sep 2013 21:10:10 GMT
To store a complex JSON object in a C* skinny row, you can serialize each
field independently as a JSON string and store each field as a C* column
within the same row (the row representing the JSON object).

So using the example you mentioned, you could store it in Cassandra as:

ColumnFamily["objectKey"]["readings"] = "[{reading1}, {reading2}, {reading3}]"
ColumnFamily["objectKey"]["events"] = "[{event1}, {event2}, {event3}]"

But in fact, that isn't an optimal way to store such data in Cassandra,
since you would need to deserialize all the readings even if you were
interested in only a particular reading or time period.
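
To make that cost concrete, here is a minimal sketch in plain Python. The
dict is only a hypothetical stand-in for a column family (it is not a
Cassandra client API), and the names store_object and find_reading are made
up for illustration:

```python
import json

# In-memory stand-in for a column family: row key -> {column name -> value}.
# Hypothetical illustration only; this is not a Cassandra client API.
column_family = {}

def store_object(row_key, obj):
    """Store each top-level field of obj as its own JSON string column."""
    row = column_family.setdefault(row_key, {})
    for field, value in obj.items():
        row[field] = json.dumps(value)

def find_reading(row_key, timestamp):
    """Find readings at one timestamp; the whole column is decoded first."""
    readings = json.loads(column_family[row_key]["readings"])
    return [r for r in readings if r["timestamp"] == timestamp]

store_object("objectKey", {
    "readings": [
        {"value": 20, "rate_of_change": 0.05, "timestamp": 1378686742465},
        {"value": 22, "rate_of_change": 0.05, "timestamp": 1378686742466},
        {"value": 21, "rate_of_change": 0.05, "timestamp": 1378686742467},
    ],
    "events": [
        {"type": "direction_change", "version": 0.1,
         "timestamp": 1378686742465},
    ],
})

# Fetching a single reading still parses all three stored readings.
match = find_reading("objectKey", 1378686742466)
```

Each field ends up as one opaque string, so the store cannot help with
filtering inside it.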

A better way to store time series data is to store one measurement/event
per column, so you're able to retrieve data for a particular time period
more easily (since columns are stored in sorted order). One way to do that
for your data would be to store them in 2 column families, as in:

Reading["objectKey"]["timestamp3"] = "{reading3}"
Reading["objectKey"]["timestamp2"] = "{reading2}"
Reading["objectKey"]["timestamp1"] = "{reading1}"

Event["objectKey"]["timestamp3"] = "{event3}"
Event["objectKey"]["timestamp2"] = "{event2}"
Event["objectKey"]["timestamp1"] = "{event1}"


So you're able to reconstruct the original JSON object "objectKey" by
fetching the columns from Reading["objectKey"] and Event["objectKey"], and
you can also efficiently query all readings between timestamp2 and
timestamp3 within that object, if necessary.
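
A rough sketch of this layout, again in plain Python rather than against a
real cluster. The TimeSeriesFamily class and its method names are
hypothetical; a sorted list plus bisect stands in for the way columns are
kept in sorted order within a row:

```python
import bisect
import json

class TimeSeriesFamily:
    """Hypothetical stand-in for a column family with sorted column names."""

    def __init__(self):
        # row key -> (sorted list of timestamps, {timestamp -> JSON string})
        self.rows = {}

    def insert(self, row_key, timestamp, measurement):
        names, values = self.rows.setdefault(row_key, ([], {}))
        if timestamp not in values:
            bisect.insort(names, timestamp)  # keep column names sorted
        values[timestamp] = json.dumps(measurement)

    def get_range(self, row_key, start, end):
        """Decode only the columns with start <= timestamp <= end."""
        names, values = self.rows.get(row_key, ([], {}))
        lo = bisect.bisect_left(names, start)
        hi = bisect.bisect_right(names, end)
        return [json.loads(values[t]) for t in names[lo:hi]]

    def get_all(self, row_key):
        """Decode every column of the row, in timestamp order."""
        names, values = self.rows.get(row_key, ([], {}))
        return [json.loads(values[t]) for t in names]

reading = TimeSeriesFamily()
event = TimeSeriesFamily()
for value, ts in [(20, 1378686742465), (22, 1378686742466),
                  (21, 1378686742467)]:
    reading.insert("objectKey", ts,
                   {"value": value, "rate_of_change": 0.05, "timestamp": ts})
event.insert("objectKey", 1378686742465,
             {"type": "direction_change", "version": 0.1,
              "timestamp": 1378686742465})

# Range query: only the two matching columns are decoded.
recent = reading.get_range("objectKey", 1378686742466, 1378686742467)

# Rebuild the original object from both column families.
rebuilt = {"readings": reading.get_all("objectKey"),
           "events": event.get_all("objectKey")}
```

Note that in this sketch two measurements with the same timestamp in the
same row overwrite each other, so if that can happen in your data the
column name would need something extra to keep entries distinct.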


In this post you can find more information on how to store time series data
in C* in an efficient way:
http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra


2013/9/11 Edward Capriolo <edlinuxguru@gmail.com>

> I was playing a while back with the concept of storing JSON into cassandra
> columns in a sortable way.
>
> Warning: This is kinda just a cool idea, I never productionized it.
> https://github.com/edwardcapriolo/Cassandra-AnyType
>
>
>
> On Wed, Sep 11, 2013 at 2:26 PM, Hartzman, Leslie <
> leslie.d.hartzman@medtronic.com> wrote:
>
>> Hi,
>>
>> What would be the recommended way to deal with a complex JSON structure,
>> short of storing the whole JSON as a value to a column? What options are
>> there to store dynamic data like this?
>>
>> e.g.,
>>
>> {
>>   "readings": [
>>     {
>>       "value": 20,
>>       "rate_of_change": 0.05,
>>       "timestamp": 1378686742465
>>     },
>>     {
>>       "value": 22,
>>       "rate_of_change": 0.05,
>>       "timestamp": 1378686742466
>>     },
>>     {
>>       "value": 21,
>>       "rate_of_change": 0.05,
>>       "timestamp": 1378686742467
>>     }
>>   ],
>>   "events": [
>>     {
>>       "type": "direction_change",
>>       "version": 0.1,
>>       "timestamp": 1378686742465,
>>       "data": {
>>         "units": "miles",
>>         "direction": "NW",
>>         "offset": 23
>>       }
>>     },
>>     {
>>       "type": "altitude_change",
>>       "version": 0.1,
>>       "timestamp": 1378686742465,
>>       "data": {
>>         "rate": 0.2,
>>         "duration": 18923
>>       }
>>     }
>>   ]
>> }
>>
>
>


-- 
Paulo Ricardo

-- 
European Master in Distributed Computing
Royal Institute of Technology - KTH
Instituto Superior Técnico - IST
http://paulormg.com
