Return-Path: X-Original-To: apmail-cassandra-user-archive@www.apache.org Delivered-To: apmail-cassandra-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id BE07510FDB for ; Thu, 23 Jan 2014 08:34:56 +0000 (UTC) Received: (qmail 89031 invoked by uid 500); 23 Jan 2014 08:34:54 -0000 Delivered-To: apmail-cassandra-user-archive@cassandra.apache.org Received: (qmail 88176 invoked by uid 500); 23 Jan 2014 08:34:52 -0000 Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@cassandra.apache.org Delivered-To: mailing list user@cassandra.apache.org Received: (qmail 88162 invoked by uid 99); 23 Jan 2014 08:34:51 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 23 Jan 2014 08:34:51 +0000 X-ASF-Spam-Status: No, hits=1.5 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: local policy includes SPF record at spf.trusted-forwarder.org) Received: from [209.85.216.174] (HELO mail-qc0-f174.google.com) (209.85.216.174) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 23 Jan 2014 08:34:44 +0000 Received: by mail-qc0-f174.google.com with SMTP id x13so2056122qcv.33 for ; Thu, 23 Jan 2014 00:34:23 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:content-type; bh=GOFTfxs984v4fNKYYI/HbFZ1FbE06cD+wXr7CjY/jnA=; b=GzM3OMFZzr6JFglHOm4ECt25fX2bPpcg87Az2zg5gkvj5XibP1i1/F6MkihccmdNQI TXI/yHRIOA7lOBYRpdbNd0JxfHGYchOj2oerGINtdz7OQJUnrjJ+6lcePH1DcfGPFnke qi+WlpK+ieK+id0o1NpIP2rlNDzfFMJYQHb4rnwWlJJT1HV0I2pX2gLfJ+IQcPu3YWEa VSv10ejd6yT6UI8IbzhF8bdJV3HVYF8tmG9B+bQNCB/efsdqvXbpk3PK1K2iePpf+RiQ +iGklMUp7lN2UzXT4jtZK3+qSiIu4/0nK84G+lXgdtNSd9BjOw1WA/8IBumFe9KtuDfK 1NmA== X-Gm-Message-State: ALoCoQnR3kekWzdgED98IPEPHQ+nFuHlR0EozB7VCGgggf/lJswkJ1XHYqwxfZrwp5A1v4mZowCW MIME-Version: 1.0 X-Received: by 10.140.98.33 with SMTP id n30mr9102535qge.8.1390466063781; Thu, 23 Jan 2014 00:34:23 -0800 (PST) Received: by 10.96.143.35 with HTTP; Thu, 23 Jan 2014 00:34:23 -0800 (PST) In-Reply-To: References: Date: Thu, 23 Jan 2014 09:34:23 +0100 Message-ID: Subject: Re: Datamodel for a highscore list From: Kasper Middelboe Petersen To: user@cassandra.apache.org Content-Type: multipart/alternative; boundary=001a113aaa3ecdfc4504f09f1bee X-Virus-Checked: Checked by ClamAV on apache.org --001a113aaa3ecdfc4504f09f1bee Content-Type: text/plain; charset=ISO-8859-1 What would the consequence be of having this updated highscore table (using friendId as part of the clustering index to avoid name collisions): CREATE TABLE highscore ( userId uuid, score int, friendId uuid, name varchar, PRIMARY KEY(userId, score, friendId) ) WITH CLUSTERING ORDER BY (score DESC); And then create an index: CREATE INDEX friendId_idx ON highscore ( friendId ); The table will have many million (I should expect 100+ million) entries. Each friendId would appear as many times as the user has friends. It sounds like a scenario where I should take care of using a custom index. I haven't worked with custom indexes in Cassandra before, but I assume this would allow me to query the table based on (userId, friendId) for updating highscores. But what would happen in this case? What queries would be affected and roughly to what degree? Would this be a viable option? On Wed, Jan 22, 2014 at 6:44 PM, Kasper Middelboe Petersen < kasper@sybogames.com> wrote: > Hi! > > I'm a little worried about the data model I have come up with for handling > highscores. > > I have a lot of users. Each user has a number of friends. I need a > highscore list pr friend list. > > I would like to have it optimized for reading the highscores as opposed to > setting a new highscore as the use case would suggest I would need to read > the list a lot more than I would need write new highscores. > > Currently I have the following tables: > CREATE TABLE user (userId uuid, name varchar, highscore int, bestcombo > int, PRIMARY KEY(userId)) > CREATE TABLE highscore (userId uuid, score int, name varchar, PRIMARY > KEY(userId, score, name)) WITH CLUSTERING ORDER BY (score DESC); > ... and a tables for friends - for the purpose of this mail assume > everyone is friends with everyone else > > Reading the highscore list for a given user is easy. SELECT * FROM > highscores WHERE userId = . > > Problem is setting a new highscore. > 1. I need to read-before-write to get the old score > 2. I'm screwed if something goes wrong and the old score gets overwritten > before all the friends highscore lists gets updated - and it is an highly > visible error due to the same user is on the highscore multiple times. > > I would very much appreciate some feedback and/or alternatives to how to > solve this with Cassandra. > > > Thanks, > Kasper > --001a113aaa3ecdfc4504f09f1bee Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable
What would the consequence be of having this updated highs= core table (using friendId as part of the clustering index to avoid name co= llisions):

CREATE TABLE highscore (
=A0 userId= uuid,
=A0 score int,
=A0 friendId uuid,
=A0 name v= archar,
=A0 PRIMARY KEY(userId, score, friendId)
) WITH CLUSTERING ORDE= R BY (score DESC);

=
And then create an index:

CREATE INDEX friendId_idx ON highscore ( frien= dId );

The table will have many millio= n (I should expect 100+ million) entries. Each friendId would appear as man= y times as the user has friends. It sounds like a scenario where I should t= ake care of using a custom index.

I haven't worked with custom in= dexes in Cassandra before, but I assume this would allow me to query the ta= ble based on (userId, friendId) for updating highscores.

But what would happen in this case? Wh= at queries would be affected and roughly to what degree?

Would this be a viable option?



On Wed, Jan 22, 2014 at 6:44 PM, K= asper Middelboe Petersen <kasper@sybogames.com> wrote:
Hi!

I= 9;m a little worried about the data model I have come up with for handling = highscores.

I have a lot of users. Each user has a number of friend= s. I need a highscore list pr friend list.

I would like to have it optimized for reading the highs= cores as opposed to setting a new highscore as the use case would suggest I= would need to read the list a lot more than I would need write new highsco= res.

Currently I have the following tables:
CREATE= TABLE user (userId uuid, name varchar, highscore int, bestcombo int, PRIMA= RY KEY(userId))
CREATE TABLE highscore (userId uuid, score int, n= ame varchar, PRIMARY KEY(userId, score, name)) WITH CLUSTERING ORDER BY (sc= ore DESC);
... and a tables for friends - for the purpose of this mail assu= me everyone is friends with everyone else

Reading = the highscore list for a given user is easy. SELECT * FROM highscores WHERE= userId =3D <id>.

Problem is setting a new highscore.
1. I need= to read-before-write to get the old score
2. I'm screwed if = something goes wrong and the old score gets overwritten before all the frie= nds highscore lists gets updated - and it is an highly visible error due to= the same user is on the highscore multiple times.

I would very much appreciate some feedback and/or alter= natives to how to solve this with Cassandra.

<= br>
Thanks,
Kasper

--001a113aaa3ecdfc4504f09f1bee--