From: Brandon Williams
Date: Sat, 16 Jul 2011 19:57:04 -0500
Subject: Re: Help with schema modelling
To: user@cassandra.apache.org

On Sat, Jul 16, 2011 at 7:08 PM, Tristan Seligmann wrote:
> I'm trying to model a schema for a logging storage system in
> Cassandra: Log messages consist of a timestamp, message, and some
> other arbitrary key/value pairs. Querying would primarily be done
> based on timestamp ranges; I will probably be doing filtering based on
> matches against the key/value pairs as well, but I expect that will be
> handled by fetching the messages in the desired time range, then
> filtering out the uninteresting ones.

I recommend reading this:
http://blog.insidesystems.net/basic-time-series-with-cassandra

> A supercolumn makes it easy enough to store the key/value pairs as
> columns, but then I end up with all of my log messages in a single
> row, which obviously won't work. On the other hand, if I use the
> timestamp as the row key, I need to use OPP to query on ranges, and
> I'd prefer not to deal with the balancing issues that would raise. I
> suppose I could go halfway; use a prefix of the timestamp (eg. date +
> hour, or perhaps date + hour + minute) as the key, and then retrieve
> all of the keys in the range I'm interested in when performing a
> query.

Do the latter and avoid OPP. Chunking by hour should be sufficient in
most cases.

-Brandon
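
For illustration, the hour-chunking idea above could be sketched roughly as follows in Python. This is only a sketch under assumptions that are not in the thread: the row key layout (YYYYMMDDHH in UTC) and the function names bucket_key and bucket_keys are hypothetical. It shows how a writer might pick the row for a log message and how a reader might list every row key covering a time range; those keys would then be fetched with whatever multi-row read the client offers, and the first and last rows trimmed to the exact range on the client side.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical key layout (an assumption, not from the thread): date + hour in UTC.
BUCKET_FORMAT = "%Y%m%d%H"


def bucket_key(ts):
    """Return the row key of the hourly bucket containing ts.

    ts is expected to be a timezone-aware datetime.
    """
    return ts.astimezone(timezone.utc).strftime(BUCKET_FORMAT)


def bucket_keys(start, end):
    """Return the row keys of every hourly bucket overlapping [start, end]."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    # Round the start down to the top of its hour, then step hour by hour.
    current = start.astimezone(timezone.utc).replace(
        minute=0, second=0, microsecond=0
    )
    end_utc = end.astimezone(timezone.utc)
    keys = []
    while current <= end_utc:
        keys.append(current.strftime(BUCKET_FORMAT))
        current += timedelta(hours=1)
    return keys


if __name__ == "__main__":
    start = datetime(2011, 7, 16, 22, 30, tzinfo=timezone.utc)
    end = datetime(2011, 7, 17, 1, 5, tzinfo=timezone.utc)
    print(bucket_key(start))
    print(bucket_keys(start, end))
```

Because the keys are computed rather than scanned, this works with the random partitioner: a range query becomes a lookup of a known, finite set of row keys. Chunking by minute instead of hour would only change BUCKET_FORMAT and the step size.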