From: Chen Xinli
Date: Fri, 19 Nov 2010 07:31:04 -0800
Message-ID: <-7753762334317987315@unknownmsgid>
Subject: Re: Data model design question
To: Nanheng Wu, "user@cassandra.apache.org"

Are the queries based entirely on the data computed that day, or on a merge of that day's data and the historical data?

Nanheng Wu wrote:

Hi,

Our team decided to use Cassandra as the storage solution for a dataset. I am very new to the NoSQL world and to Cassandra, so I am hoping to get some help from the community.

The dataset is pretty simple: for each key we have a number of columns with values. Each day we compute a new version of this dataset. The new version will mostly update existing keys, but it could also add and delete some keys. (We'll also build a service that queries Cassandra.)

A key requirement for us is to keep N versions of the dataset around, in case we discover problems in the current version and need to roll back to an older one.

I thought about creating a Column Family per version. This means we would create a new column family every day and occasionally delete column families according to some truncation policy. I know Cassandra 0.7 now makes changing the schema easier, but is this a good way to go? I would really like to hear what you guys think is the better way to handle this.

Thank you.

Best,
Alex
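
To make the scheme in the question concrete, here is a rough in-memory sketch of "one container per daily version, keep N, roll back on demand". It is not Cassandra client code: the class `VersionedDataset`, its methods, and the date-style version names are all hypothetical, and each dictionary only stands in for what would be a column family.

```python
from collections import OrderedDict


class VersionedDataset:
    """In-memory stand-in for one-column-family-per-version storage.

    Hypothetical illustration only; no Cassandra API is used here.
    """

    def __init__(self, keep_versions):
        if keep_versions < 1:
            raise ValueError("keep_versions must be at least 1")
        self.keep_versions = keep_versions
        # version name -> {row key -> {column -> value}}, oldest first
        self.versions = OrderedDict()
        self.current = None  # name of the version that queries read from

    def publish(self, name, rows):
        """Store a full daily snapshot, make it current, prune old versions."""
        if name in self.versions:
            raise ValueError("version already exists: " + name)
        self.versions[name] = {key: dict(cols) for key, cols in rows.items()}
        self.current = name
        # Truncation policy assumed here: keep only the newest N versions.
        while len(self.versions) > self.keep_versions:
            self.versions.popitem(last=False)

    def rollback(self, name):
        """Point queries at an older version that is still retained."""
        if name not in self.versions:
            raise KeyError("version not retained: " + name)
        self.current = name

    def get(self, key):
        """Read a row from the current version, or None if absent."""
        if self.current is None:
            return None
        return self.versions[self.current].get(key)


store = VersionedDataset(keep_versions=2)
store.publish("2010-11-17", {"a": {"score": "1"}})
store.publish("2010-11-18", {"a": {"score": "2"}, "b": {"score": "5"}})
store.publish("2010-11-19", {"b": {"score": "6"}})  # key "a" deleted today
store.rollback("2010-11-18")  # a problem was found in today's version
```

Because every version is a complete snapshot, a rollback is only a pointer change and a deleted key needs no tombstone; the cost is storing N full copies of the dataset.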