From: Paulo Motta
Date: Wed, 11 Sep 2013 18:10:10 -0300
Subject: Re: Complex JSON objects
To: user@cassandra.apache.org

To store a complex JSON object in a C* skinny row, you can serialize each field independently as a JSON string and store each field as a C* column within the same row (the row then represents the JSON object).

So, using the example you mentioned, you could store it in Cassandra as:

ColumnFamily["objectKey"]["readings"] = "[{reading1}, {reading2}, {reading3}]"
ColumnFamily["objectKey"]["events"] = "[{event1}, {event2}, {event3}]"
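
To make that layout a bit more concrete, here is a minimal Python sketch. It assumes a plain dict as a stand-in for a column family (no Cassandra driver is involved), and the names column_family, store_object and load_readings are made up for the illustration. The sample values come from the JSON in the original question.

```python
import json

# Stand-in for a column family: row key -> {column name -> column value}.
# This only models the layout; it is not a Cassandra client.
column_family = {}

def store_object(row_key, obj):
    """Store each top-level field of obj as one column holding a JSON string."""
    row = column_family.setdefault(row_key, {})
    for field, value in obj.items():
        row[field] = json.dumps(value)

def load_readings(row_key):
    """Decode the whole 'readings' column back into a list."""
    return json.loads(column_family[row_key]["readings"])

obj = {
    "readings": [
        {"value": 20, "rate_of_change": 0.05, "timestamp": 1378686742465},
        {"value": 22, "rate_of_change": 0.05, "timestamp": 1378686742466},
    ],
    "events": [
        {"type": "direction_change", "version": 0.1, "timestamp": 1378686742465},
    ],
}
store_object("objectKey", obj)
print(sorted(column_family["objectKey"]))  # column names stored in the row
```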

In fact, that isn't an optimal way to store such data in Cassandra, since you would need to deserialize all the readings even if you were only interested in a particular reading or time period.

A better way to store time series data is to store one measurement/event per column, so you can retrieve the data for a particular time period more easily (since columns are stored in sorted order). One way to do that for your data would be to store it in two column families, as in:

Reading["objectKey"]["timestamp3"] = "{reading3}"
Reading["objectKey"]["timestamp2"] = "{reading2}"
Reading["objectKey"]["timestamp1"] = "{reading1}"

Event["objectKey"]["timestamp3"] = "{event3}"
Event["objectKey"]["timestamp2"] = "{event2}"
Event["objectKey"]["timestamp1"] = "{event1}"
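
As a rough sketch of the one-entry-per-column layout, the Python below again uses plain dicts as stand-ins for the two column families. The names reading_cf, event_cf, store_entry and columns_in_order are hypothetical, and the sorting is done by hand here because a dict does not keep columns ordered the way C* does.

```python
import json

# Stand-ins for the two column families: row key -> {timestamp -> JSON string}.
reading_cf = {}
event_cf = {}

def store_entry(cf, row_key, entry):
    """Store one measurement/event as its own column, named by its timestamp."""
    cf.setdefault(row_key, {})[entry["timestamp"]] = json.dumps(entry)

def columns_in_order(cf, row_key):
    """Return (timestamp, JSON string) pairs sorted by column name."""
    return sorted(cf.get(row_key, {}).items())

readings = [
    {"value": 20, "rate_of_change": 0.05, "timestamp": 1378686742465},
    {"value": 22, "rate_of_change": 0.05, "timestamp": 1378686742466},
    {"value": 21, "rate_of_change": 0.05, "timestamp": 1378686742467},
]
# Insert out of order on purpose; the columns still come back sorted.
for reading in reversed(readings):
    store_entry(reading_cf, "objectKey", reading)

store_entry(event_cf, "objectKey", {
    "type": "altitude_change", "version": 0.1, "timestamp": 1378686742465,
    "data": {"rate": 0.2, "duration": 18923},
})

for timestamp, value in columns_in_order(reading_cf, "objectKey"):
    print(timestamp, value)
```

One thing to watch: the two events in the example JSON share the same timestamp, so with the timestamp alone as the column name the second would overwrite the first. Adding something like the event type to the column name would avoid that, though that is my assumption and not something the layout above specifies.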


That way you can reconstruct the original JSON object for "objectKey" by fetching the columns from Reading["objectKey"] and Event["objectKey"], and you can also efficiently query all readings between timestamp2 and timestamp3 that occurred inside the JSON object, if necessary.
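
The range query and the reconstruction can be sketched in Python as well. This is only an in-memory model: each row is a dict of timestamp -> JSON string, slice_columns and rebuild_object are hypothetical helper names, and bisect over the sorted column names stands in for the column slice that C* would do on the server side.

```python
import bisect
import json

def slice_columns(row, start, end):
    """Return the decoded columns whose timestamp lies in [start, end], in order."""
    names = sorted(row)  # C* keeps columns sorted; the dict stand-in does not
    lo = bisect.bisect_left(names, start)
    hi = bisect.bisect_right(names, end)
    return [json.loads(row[name]) for name in names[lo:hi]]

def rebuild_object(reading_row, event_row):
    """Reassemble the original JSON object from the two rows."""
    return {
        "readings": [json.loads(reading_row[t]) for t in sorted(reading_row)],
        "events": [json.loads(event_row[t]) for t in sorted(event_row)],
    }

reading_row = {
    1378686742465: json.dumps({"value": 20, "rate_of_change": 0.05, "timestamp": 1378686742465}),
    1378686742466: json.dumps({"value": 22, "rate_of_change": 0.05, "timestamp": 1378686742466}),
    1378686742467: json.dumps({"value": 21, "rate_of_change": 0.05, "timestamp": 1378686742467}),
}
event_row = {
    1378686742465: json.dumps({"type": "direction_change", "version": 0.1, "timestamp": 1378686742465}),
}

recent = slice_columns(reading_row, 1378686742466, 1378686742467)
print([r["value"] for r in recent])  # prints [22, 21]
print(json.dumps(rebuild_object(reading_row, event_row), indent=2))
```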


In this post you can find more information on how to store time series data in C* in an efficient way: http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra



2013/9/11 Edward Capriolo <edlinuxguru@gmail.com>

> I was playing a while back with the concept of storing JSON into Cassandra
> columns in a sortable way.
>
> Warning: This is kinda just a cool idea, I never productionized it.
> https://github.com/edwardcapriolo/Cassandra-AnyType



> On Wed, Sep 11, 2013 at 2:26 PM, Hartzman, Leslie
> <leslie.d.hartzman@medtronic.com> wrote:
>
>> Hi,
>>
>> What would be the recommended way to deal with a complex JSON structure,
>> short of storing the whole JSON as a value to a column? What options are
>> there to store dynamic data like this?
>>
>> e.g.,
>>
>> {
>>   "readings": [
>>     {
>>       "value": 20,
>>       "rate_of_change": 0.05,
>>       "timestamp": 1378686742465
>>     },
>>     {
>>       "value": 22,
>>       "rate_of_change": 0.05,
>>       "timestamp": 1378686742466
>>     },
>>     {
>>       "value": 21,
>>       "rate_of_change": 0.05,
>>       "timestamp": 1378686742467
>>     }
>>   ],
>>   "events": [
>>     {
>>       "type": "direction_change",
>>       "version": 0.1,
>>       "timestamp": 1378686742465,
>>       "data": {
>>         "units": "miles",
>>         "direction": "NW",
>>         "offset": 23
>>       }
>>     },
>>     {
>>       "type": "altitude_change",
>>       "version": 0.1,
>>       "timestamp": 1378686742465,
>>       "data": {
>>         "rate": 0.2,
>>         "duration": 18923
>>       }
>>     }
>>   ]
>> }






--
Paulo Ricardo

--
European Master in Distributed Computing
Royal Institute of Technology - KTH
Instituto Superior Técnico - IST
http://paulormg.com