From: "Hiller, Dean"
To: "user@cassandra.apache.org"
Date: Wed, 19 Sep 2012 14:33:01 -0600
Subject: Re: Correct model

Uhm, unless I am mistaken, a NEW request implies a new UUID, so you can just write it both to the index to the request row and to the user that the request was for, all in one shot with no need to read, right?

(Also, read before write is not necessarily bad… it really depends on your situation, but in this case I don't think you need read before write.)

For your structured data comment…

Actually, playOrm stores both structured and unstructured data. It follows the pattern Cassandra is adopting more and more of "partial" schemas, and plans to hold to that path. It is a complete break from JPA because NoSQL is so different.

> and each request would have its own id, right

Yes, in my design, I chose for each request to have its own id.

> Wouldn't it be faster to have a composite key in the requestCF itself?

In CQL, don't you have to have an == in the first part of the clause, meaning you would have to select the user id? BUT you wanted requests > date no matter which user, so the indices I gave you have that information with a simple column slice of the data. The indices I gave you look like this (composite column names)…

.., .., <time3>..

NOTE that each is a UUID there in the <>, so they are unique.

Maybe there is a way, but I am not sure how to get all the latest requests > date for every user… I guess you could always map/reduce, but that is generally reserved for analytics, or maybe for updating new index tables you are creating for faster reads.

Later,
Dean

From: Marcelo Elias Del Valle
Reply-To: "user@cassandra.apache.org"
Date: Wednesday, September 19, 2012 1:47 PM
To: "user@cassandra.apache.org"
Subject: Re: Correct model

2012/9/19 Hiller, Dean

Thinking out loud, and I think a bit towards playOrm's model, though you don't need to use playOrm for this.

1.
I would probably have a User with the requests either embedded in it or with the foreign keys to the requests… either is fine, as long as you get the user, get ALL the FKs, and make one request to get the requests for that user.

This was my first option. However, every time I have a new request I would need to read the column "request_ids", update its value, and then write the result. This would be a read-before-write, which is bad in Cassandra, right? Or were you talking about other kinds of FKs?

2. I would create rows for the index and index each month of data OR maybe index each day of data (depends on your system). Then I can just query into the index for that one month. With playOrm S-SQL, this is a simple PARTITIONS r(:thismonthParititonId) SELECT r FROM Request r where r.date > :date, OR you just do a column range query doing the same thing into your index. The index is basically the wide row pattern ;) with composite keys of .<rowkey of request>

I would consider playOrm at a later step in my project, as my understanding now is that it is good for storing relational, structured data. I cannot predict which columns I am going to store in requestCF. But regardless, even in Cassandra, you would still use a composite key; it seems you would create an indexCf using the wide row pattern, and each request would have its own id, right? But why? Wouldn't it be faster to have a composite key in the requestCF itself?

From: Marcelo Elias Del Valle
Reply-To: "user@cassandra.apache.org"
Date: Wednesday, September 19, 2012 1:02 PM
To: "user@cassandra.apache.org"
Subject: Correct model

I am new to Cassandra and to NoSQL in general.
I built my first model and any comments would be of great help. I am describing my thoughts below.

It's a very simple model. I will need to store several users and, for each user, I will need to store several requests. Each request has its insertion time.
As the query comes first, here are the only queries I will need to run against this model:
- Select all the requests for a user
- Select all the users that have new requests since date D

I created the following model: a UserCF, whose key is a userID generated by TimeUUID, and a RequestCF, whose key is composite: UserUUID + timestamp. For each user, I will store basic data and, for each request, I will insert a lot of columns.

My questions:
- Is the strategy of using a composite key good for this case? I thought of other solutions, but this one seemed to be the best. Another solution would be to have a non-composite key of type UUID for the requests, and have another CF to relate user and request.
- To perform the second query, instead of checking whether each user has a request inserted after date D, I thought of storing the last request insertion date in the UserCF every time I have a new insert for the user. It would be data replication, but I would have no read-before-write and I am guessing the second query would perform faster.

Any thoughts?

--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
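
For anyone following the thread, here is a rough sketch of the layout being discussed: one request row per UUID, a column in the user's row per request, and a wide index row per month whose column names are (time, request id) composites. It is only an illustration under assumptions. Plain Python dicts and lists stand in for column families, no Cassandra client API is used, and all names (request_cf, user_cf, month_index, add_request, and so on) are hypothetical. Storing the user id as the index column value is also an assumption, made so that the second query does not need a further lookup.

```python
import bisect
import uuid
from collections import defaultdict
from datetime import datetime

# Hypothetical in-memory stand-ins for three column families.
request_cf = {}                   # request id -> request columns
user_cf = defaultdict(dict)       # user id -> {request id: insertion time}
month_index = defaultdict(list)   # "YYYY-MM" -> sorted [(time, request id, user id)]


def add_request(user_id, when, columns):
    """Store a new request using writes only; nothing is read first."""
    request_id = str(uuid.uuid4())  # a new request implies a new UUID
    request_cf[request_id] = dict(columns, user_id=user_id, time=when)
    # Adding a column to the user's row does not require reading the row.
    user_cf[user_id][request_id] = when
    # Composite column name (time, request id) keeps the index row sorted by time.
    partition = when.strftime("%Y-%m")
    bisect.insort(month_index[partition], (when, request_id, user_id))
    return request_id


def requests_for_user(user_id):
    """Query 1: all the requests for a user."""
    return [request_cf[rid] for rid in user_cf.get(user_id, {})]


def users_with_requests_since(since, partition):
    """Query 2: a column slice of one month's index row, at or after `since`."""
    row = month_index.get(partition, [])
    # (since,) sorts before any (since, id, user) entry, so the slice is inclusive.
    start = bisect.bisect_left(row, (since,))
    return {user for _, _, user in row[start:]}
```

A date range that spans several months would need one slice per month partition, with the results merged on the client side.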