From: "Hiller, Dean"
To: "user@cassandra.apache.org"
Date: Tue, 25 Sep 2012 08:53:09 -0600
Subject: Re: Correct model

Just fyi that some of these are cassandra questions...

Dean, In the playOrm data modeling, if I understood it correctly, every CF has its own id, right?
No, each entity has a field annotated with @NoSqlId. That tells playOrm this is the row key. Each INSTANCE of the entity is a row in cassandra (very much like hibernate for RDBMS). So every instance of Activity has a different NoSqlId (NOTE: ids are auto generated so you don't need to deal with it, though you can set it manually if you like).

For instance, User would have its own ID, Activities would have its own id, etc.
User has a field private String id; annotated with @NoSqlId so each INSTANCE of User has its own id and each INSTANCE of Activity has its own id.

What if I have a trillion activities?
This is fine and is a normal cassandra use-case. In fact, this is highly desirable in nosql stores and retrieving by key is desired when possible.

Wouldn't be a problem to have 1 row id for each activity?
Nope, no problems.

Cassandra always indexes by row id, right?
If you do CQL and cassandra partitioning/indexing, then yes, BUT if you do PlayOrm partitioning, then NO. PlayOrm indexes your columns and there is ONE index for EACH partition, so if you have 1 trillion rows and 1 billion partitions, then each index on average is 1000 rows only, so you can do a quick query into an index that only has 1000 values.

If I have too many row ids without using composite keys, will it scale the same way?
Yes, partitioning is the key though... you must decide your partitioning so that partitions (or I could say indices) do not have a very high row count. I currently maintain less than 1 million, but I would say it slows down somewhere in the millions of rows per partition (i.e. you can get pretty big, but smaller can be better).

Wouldn't the time to insert an activity be each time longer because I have too many activities?
Nope, this is a cassandra question really, and cassandra is optimized, as all noSQL stores are, to put and read values by key. They all work best that way.

Behind the scenes there is a meta table that PlayOrm writes to (one row per java class you create that is annotated with @NoSqlEntity) and that is used to drive the ad-hoc tool so you can query into cassandra and not get hex out, but get the real values and see them.

Best regards,
Marcelo Valle.

2012/9/25 Hiller, Dean
If you need anything added/fixed, just let PlayOrm know. PlayOrm has been able to quickly add so far... that may change as more and more requests come, but so far PlayOrm seems to have managed to keep up.

We are using it live by the way already. It works out very well so far for us (we have 5000 column families, obviously dynamically created instead of by hand... a very interesting use case of cassandra). In our live environment we configured astyanax with LOCAL_QUORUM on reads AND writes, so CP style, so we can afford one node out of 3 to go down, but if two go down it stops working, THOUGH there is a patch in astyanax to auto switch from LOCAL_QUORUM to consistency level ONE for reads/writes when two nodes go down that we would like to suck in eventually so it is always live (I don't think Hector has that and it is a really NICE feature... i.e. fail the LOCAL_QUORUM read/write and then try again with consistency level of ONE).

Later,
Dean

From: Marcelo Elias Del Valle
Reply-To: "user@cassandra.apache.org"
Date: Monday, September 24, 2012 1:54 PM
To: "user@cassandra.apache.org"
Subject: Re: Correct model

Dean, this sounds like magic :D
I don't know details about the performance of the index implementations you chose, but it would pave the way to use it in my case, as I don't need the best performance in the world when reading, but I need to assure scalability and have a simple model to maintain. I liked the playOrm concept regarding this.
I have more doubts, but I will ask them at Stack Overflow from now on.

2012/9/24 Hiller, Dean
PlayOrm will automatically create a CF to index my CF?
It creates 3 CF's for all indices, IntegerIndice, DecimalIndice, and StringIndice, such that the ad-hoc tool that is in development can display the indices, as it knows the prefix of the composite column name is of Integer, Decimal or String and it knows the postfix type as well, so it can translate back from bytes to the types and properly display in a GUI (i.e. on top of SELECT, the ad-hoc tool is adding a way to view the index rows so you can check if they got corrupt or not).

Will it auto-manage it, like Cassandra's secondary indexes?
YES

Further detail...
You annotate fields with @NoSqlIndexed and PlayOrm adds/removes from the index as you add/modify/remove the entity... a modify removes the old value from the index and inserts the new value into the index.

An example would be that PlayOrm stores all long, int, short, byte in a type that uses the least amount of space, so IF you have a long OR BigInteger between -128 to 128 it only ends up storing 1 byte in cassandra (SAVING tons of space!!!). Then if you are indexing a type that is one of those, PlayOrm creates an IntegerIndice table.

Right now, another guy is working on playorm-server, which is a web GUI to allow ad-hoc access to all your data as well, so you can run ad-hoc queries to see data and, instead of showing hex, it shows the real values by translating the bytes to String for the schema portions that it is aware of, that is.

Later,
Dean

From: Marcelo Elias Del Valle
Reply-To: "user@cassandra.apache.org"
Date: Monday, September 24, 2012 12:09 PM
To: "user@cassandra.apache.org"
Subject: Re: Correct model

Dean,

There is one last thing I would like to ask about playOrm by this list, the next questions will come by stackOverflow.
Just because of the context, I prefer asking this here:

When you say playOrm indexes a table (which would be a CF behind the scenes), what do you mean? PlayOrm will automatically create a CF to index my CF? Will it auto-manage it, like Cassandra's secondary indexes?
In Cassandra, the application is responsible for maintaining the index, right? I might be wrong, but unless I am using secondary indexes I need to update index values manually, right?
I got confused when you said "PlayOrm indexes the columns you choose". How do I choose and what exactly does it mean?

Best regards,
Marcelo Valle.

2012/9/24 Hiller, Dean
Oh, ok, you were talking about the wide row pattern, right?
yes

But playORM is compatible with Aaron's model, isn't it?
Not yet, PlayOrm supports partitioning one table multiple ways as it indexes the columns (in your case, the userid FK column and the time column).

Can I map exactly this using playORM?
Not yet, but the plan is to map these typical Cassandra scenarios as well.

Can I ask playOrm questions in this list?
The best place to ask PlayOrm questions is on stack overflow, tagged with PlayOrm, though I monitor this list and stack overflow for questions (there are already a few questions on stack overflow).

The examples directory is empty for now, I would like to see how to set up the connection with it.
Running build or build.bat is always kept working and all 62 tests pass (or we don't merge to master), so to see how to make a connection or run an example:
 1. Run build.bat or build, which generates parsing code.
 2. Import into eclipse (it already has .classpath and .project there for you).
 3. In FactorySingleton.java you can modify IN_MEMORY to CASSANDRA or not and run any of the tests in-memory or against localhost (we also run the test suite against a 6 node cluster and all tests pass).
 4. FactorySingleton probably has the code you are looking for, plus you need a class called nosql.Persistence or it won't scan your jar file (a class file, not an xml file like JPA).

Do you mean I need to load all the keys in memory to do a multi get?
No, you batch. I am not sure about CQL, but PlayOrm returns a Cursor, not the results, so you can loop through every key and behind the scenes it is doing batch requests, so you can load up 100 keys and make one multi get request for those 100 keys and then load up the next 100 keys, and so on. I need to look more into the apis and protocol of CQL to see if it allows this style of batching. PlayOrm does support this style of batching today. Aaron would know if CQL does.

Why did you move? Hector is being considered for being the "official" client for Cassandra, isn't it?
At the time, I wanted the file streaming feature. Also, Hector seemed a bit cumbersome compared to astyanax, or at least if you were building a platform and had no use for typing the columns. Just personal preference really here.

I am not sure I understood this part. If I need to refactor, having the partition id in the key would be a bad thing? What would be the alternative? In my case, as I use userId : partitionId as row key, this might be a problem, right?
PlayOrm indexes the columns you choose (i.e. the ones you want to use in the where clause) and partitions by columns you choose, not based on the key, so in PlayOrm the key is typically a TimeUUID or something cluster unique... any tables referencing that TimeUUID never have to change. With Cassandra partitioning, if you repartition that table a different way or go for some kind of major change (usually done with map/reduce), all your foreign keys "may" have to change... it really depends on the situation though. Maybe you get the design right and never have to change.
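
As a rough illustration of the "you batch" answer above (a cursor that loads 100 keys, makes one multi get for them, then loads the next 100), here is a minimal plain-Java sketch. It is not PlayOrm's actual Cursor: BatchingCursor, BatchingCursorDemo and the multiGet function are made-up names for this example, and the store is faked with a lambda.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

// Hypothetical sketch, NOT PlayOrm's Cursor: walks a stream of row keys and
// resolves them in fixed-size batches, one "multi get" call per batch.
class BatchingCursor<K, V> implements Iterator<V> {
    private final Iterator<K> keys;
    private final int batchSize;
    // Assumed stand-in for a single multi get round trip to the store.
    private final Function<List<K>, List<V>> multiGet;
    private Iterator<V> current = Collections.emptyIterator();

    BatchingCursor(Iterator<K> keys, int batchSize, Function<List<K>, List<V>> multiGet) {
        if (batchSize <= 0) throw new IllegalArgumentException("batchSize must be positive");
        this.keys = keys;
        this.batchSize = batchSize;
        this.multiGet = multiGet;
    }

    @Override
    public boolean hasNext() {
        // Fetch the next batch only once the current one is exhausted,
        // so at most batchSize rows are held in memory at a time.
        while (!current.hasNext() && keys.hasNext()) {
            List<K> batch = new ArrayList<>(batchSize);
            while (batch.size() < batchSize && keys.hasNext()) {
                batch.add(keys.next());
            }
            current = multiGet.apply(batch).iterator();
        }
        return current.hasNext();
    }

    @Override
    public V next() {
        if (!hasNext()) throw new NoSuchElementException();
        return current.next();
    }
}

class BatchingCursorDemo {
    public static void main(String[] args) {
        List<Integer> keys = new ArrayList<>();
        for (int i = 0; i < 250; i++) keys.add(i);

        // Fake store: each call represents one multi get for up to 100 keys.
        // With 250 keys this is called three times (100, 100 and 50 keys).
        BatchingCursor<Integer, String> cursor =
            new BatchingCursor<Integer, String>(keys.iterator(), 100, batch -> {
                System.out.println("multi get for " + batch.size() + " keys");
                List<String> rows = new ArrayList<>();
                for (Integer k : batch) rows.add("row-" + k);
                return rows;
            });

        int count = 0;
        while (cursor.hasNext()) {
            cursor.next();
            count++;
        }
        System.out.println("rows read: " + count);
    }
}
```

The caller only ever loops over the cursor; the batch size decides how many keys go into each round trip, which is the trade-off between memory use and number of requests.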

  @NoSqlQuery(name="findWithJoinQuery", query="PARTITIONS t(:partId) SELECT t FROM TABLE as t "+
  "INNER JOIN t.activityTypeInfo as i WHERE i.type = :type and t.numShares < :shares"),

What would happen behind the scenes when I execute this query?
In this case, t or TABLE is a partitioned table since a partition is defined. And t.activityTypeInfo refers to the ActivityTypeInfo table, which is not partitioned (AND ActivityTypeInfo won't scale to billions of rows because there is no partitioning, but maybe you don't need it!!!). Behind the scenes, when you call getResult, it returns a cursor that has NOT done anything yet. When you start looping through the cursor, behind the scenes it is batching requests asking for the next 500 matches (configurable) so you never run out of memory... it is EXACTLY like a database cursor. You can even use the cursor to show a user the first set of results and, when the user clicks next, pick up right where the cursor left off (if you saved it to the HttpSession).

You can only use joins with partition keys, right?
Nope, joins work on anything. You only need to specify the partitionId when you have a partitioned table in the list of join tables. (That is what the PARTITIONS clause is for, to identify partitionId = what?)... it was put BEFORE the SQL instead of within it... CQL took the opposite approach, but PlayOrm can also join different partitions together as well ;)

In this case, is partId the row id of TABLE CF?
Nope, partId is one of the columns. There is a test case on this class in PlayOrm... (notice the annotation NoSqlPartitionByThisField on the column/field in the entity)...
https://github.com/deanhiller/playorm/blob/master/input/javasrc/com/alvazan/test/db/PartitionedSingleTrade.java

PlayOrm allows partitioned tables AND non-partitioned tables (non-partitioned tables won't scale but maybe you will never have that many rows).
You can join any two combinations (non-partitioned with partitioned, non-partitioned with non-partitioned, partition with another partition).

I only prefer stackoverflow as I like referencing links/questions with their urls. To reference this email is very hard later on as I have to find it, so in general, I HATE email lists ;) but it seems cassandra prefers them, so any questions on PlayOrm you can put there, and I am not sure how many on this list may or may not be interested, so it creates less noise on this list too.

Later,
Dean

From: Marcelo Elias Del Valle
Reply-To: "user@cassandra.apache.org"
Date: Monday, September 24, 2012 11:07 AM
To: "user@cassandra.apache.org"
Subject: Re: Correct model

2012/9/24 Hiller, Dean
I am confused. In this email you say you want "get all requests for a user" and in a previous one you said "Select all the users which has new requests, since date D" so let me answer both...

I have both needs. These are the two queries I need to perform on the model.

For the latter, you make ONE query into the latest partition (ONE partition) of the GlobalRequestsCF, which gives you the most recent requests ALONG with the user ids of those requests. If you queried all partitions, you would most likely blow out your JVM memory.

For the former, you make ONE query to the UserRequestsCF with userid = to get all the requests for that user

Now I think I got the main idea! This answered a lot!

Sorry, I was skipping some context. A lot of the backing indexing sometimes is done as a long row, so in playOrm, too many rows in a partition means == too many columns in the indexing row for that partition. I believe the same is true in cassandra for their indexing.

Oh, ok, you were talking about the wide row pattern, right? But playORM is compatible with Aaron's model, isn't it? Can I map exactly this using playORM?
The hardest thing for me to use playORM now is I don't know Cassandra well yet, and I know playORM even less. Can I ask playOrm questions in this list? I will try to create a POC here!
Only now I am starting to understand what it does ;-) The examples directory is empty for now, I would like to see how to set up the connection with it.

Cassandra spreads all your data out on all nodes with or without partitions. A single partition does have its data co-located though.

Now I see. The main advantage of using partitions is keeping the indexes small enough. It has nothing to do with the nodes. Thanks!

If you are at 100k (and the requests are rather small), you could embed all the requests in the user or go with Aaron's below suggestion of a UserRequestsCF. If your requests are rather large, you probably don't want to embed them in the User. Either way, it's one query or one row key lookup.

I see it now.

Multiget ignores partitions... you feed it a LIST of keys and it gets them. It just so happens that partitionId had to be part of your row key.

Do you mean I need to load all the keys in memory to do a multiget?

I have used Hector and now use Astyanax. I don't worry much about that layer, but I feed astyanax 3 nodes and I believe it discovers some of the other ones. I believe the latter is true but am not 100% sure as I have not looked at that code.

Why did you move? Hector is being considered for being the "official" client for Cassandra, isn't it? I looked at the Astyanax api and it seemed much more high level though

As an analogy on the above, if you happen to have used PlayOrm, you would ONLY need one Requests table and you partition by user AND time (two views into the same data partitioned two different ways) and you can do exactly the same thing as Aaron's example.
PlayOrm doesn't embed the partition ids in the key, leaving it free to partition twice like in your case... and in a refactor, you have to map/reduce A LOT more rows because of rows having the FK of whereas if you don't have partition id in the key, you only map/reduce the partitioned table in a redesign/refactor. That said, we will be adding support for CQL partitioning in addition to PlayOrm partitioning even though it can be a little less flexible sometimes.

I am not sure I understood this part. If I need to refactor, having the partition id in the key would be a bad thing? What would be the alternative? In my case, as I use userId : partitionId as row key, this might be a problem, right?

Also, CQL locates all the data on one node for a partition. We have found it can be faster "sometimes" with the parallelized disks that the partitions are NOT all on one node, so PlayOrm partitions are virtual only and do not relate to where the rows are stored. An example on our 6 nodes was a join query on a partition with 1,000,000 rows that took 60ms (of course I can't compare to CQL here since it doesn't do joins). It really depends on how much data is going to come back in the query too. There are tradeoffs between disk parallel nodes and having your data all on one node of course.

I guess I am still not ready for this level of info. :D
In the playORM readme, we have the following:

  @NoSqlQuery(name="findWithJoinQuery", query="PARTITIONS t(:partId) SELECT t FROM TABLE as t "+
  "INNER JOIN t.activityTypeInfo as i WHERE i.type = :type and t.numShares < :shares"),

What would happen behind the scenes when I execute this query? You can only use joins with partition keys, right?
In this case, is partId the row id of TABLE CF?
Thanks a lot for the answers

--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr