From: onlinespending
Subject: Exactly one wide row per node for a given CF?
Date: Mon, 2 Dec 2013 21:09:13 -0800
To: user@cassandra.apache.org
Message-Id: <17D87E4D-6D94-4891-9860-94FC4E777ED6@gmail.com>

The subject says it all. I want to randomly distribute a large set of records but keep them clustered in one wide row per node.

As an example, let's say I've got a collection of about 1 million records, each with a unique id. If I just set the primary key (and therefore the partition key) to the unique id, I'll get very good random distribution across my server cluster. However, each record will be its own row. I'd like each record to belong to one large wide row (per server node) so I can have them sorted or clustered on some other column.

If I have, say, 5 nodes in my cluster, I could randomly assign a value of 1 - 5 at creation time and set the partition key to this value. But this becomes troublesome if I add or remove nodes. What I effectively want is to partition on the unique id of the record modulo N (id % N, where N is the number of nodes).

I have to imagine there's a mechanism in Cassandra to simply randomize the partitioning without even using a key (and then clustering on some column).

Thanks for any help.
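
To make the id % N idea concrete, here is a minimal Python sketch of the bucket arithmetic only. It assumes integer ids, and the helper names `bucket_for` and `moved_fraction` are hypothetical, made up for illustration. It also yields buckets 0 to N-1 rather than 1 to N. It does not talk to Cassandra and says nothing about which node a given bucket value would actually land on.

```python
def bucket_for(record_id: int, num_buckets: int) -> int:
    """Hypothetical helper: bucket a record by its id modulo the bucket count."""
    return record_id % num_buckets


def moved_fraction(ids, old_n: int, new_n: int) -> float:
    """Fraction of ids whose bucket changes when the bucket count changes."""
    ids = list(ids)
    moved = sum(1 for i in ids if bucket_for(i, old_n) != bucket_for(i, new_n))
    return moved / len(ids)


if __name__ == "__main__":
    # Assumed example: sequential integer ids, growing from 5 to 6 buckets.
    ids = range(1_000_000)
    print(f"{moved_fraction(ids, 5, 6):.3f} of records change bucket")
```

With sequential ids, going from N = 5 to N = 6 reassigns roughly five out of every six records, which is the "troublesome" part of tying the bucket count to the node count.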