Subject: Paging Columns from a Row
From: Joseph Stein
To: user@cassandra.apache.org
Date: Sun, 5 Jun 2011 14:43:38 -0400

What are the best practices for paging and slicing columns from a row?

Let's say I have 1,000,000 columns in a row.

I read the row, but I want one thread to read columns 0 - 9999, a second thread (an actor in my case) to read columns 10000 - 19999, and so on, so that I can have 100 workers each processing 10,000 columns for each of my rows.
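
As a minimal sketch of the split described above, the per-worker index ranges could be computed as follows. The helper name `worker_ranges` is hypothetical and is not part of any Cassandra API; it only illustrates the arithmetic of dividing 1,000,000 columns into blocks of 10,000.

```python
def worker_ranges(total_columns, columns_per_worker):
    """Yield inclusive (first, last) column index ranges, one per worker."""
    for first in range(0, total_columns, columns_per_worker):
        # The last block is clamped in case the total is not an exact multiple.
        last = min(first + columns_per_worker, total_columns) - 1
        yield first, last


ranges = list(worker_ranges(1_000_000, 10_000))
print(len(ranges))   # 100
print(ranges[0])     # (0, 9999)
print(ranges[1])     # (10000, 19999)
```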

If there is no API for this, is it something I should use a composite key for, populating the rows with a counter like this?

0000000:myoriginalcolumnnameX
0000001:myoriginalcolumnnameY
0000002:myoriginalcolumnnameZ

Going the composite key route and doing a start/end predicate would work, but then it kind of makes the insertion/load of this data have to go through a single synchronized point to generate the column names. I am not opposed to this, but I would prefer that both the load and the processing of my data not be bound by any single lock (even a distributed one).
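
The composite-name idea with a start/end predicate can be sketched in plain Python. This is an in-memory stand-in, not a Cassandra client call: the names `composite_name` and `slice_columns` are hypothetical, the row is simulated as a sorted list, and the sketch assumes column names compare as ASCII strings so that zero-padded counters sort numerically.

```python
from bisect import bisect_left, bisect_right

WIDTH = 7  # digits in the zero-padded counter, matching 0000000 above


def composite_name(counter, original_name):
    """Build a column name such as '0000001:myoriginalcolumnnameY'."""
    return f"{counter:0{WIDTH}d}:{original_name}"


def slice_columns(sorted_names, start, finish):
    """Return names with start <= name <= finish (an in-memory stand-in
    for a start/end slice predicate; this is not a Cassandra client call)."""
    lo = bisect_left(sorted_names, start)
    hi = bisect_right(sorted_names, finish)
    return sorted_names[lo:hi]


# Hypothetical row with 30 columns, kept in sorted order as a row would be.
row = sorted(composite_name(i, f"col{i}") for i in range(30))

# One worker takes counters 10..19. ';' sorts just after ':', so the finish
# bound covers every name that begins with '0000019:'.
start = f"{10:0{WIDTH}d}:"
finish = f"{19:0{WIDTH}d};"
page = slice_columns(row, start, finish)
print(len(page), page[0], page[-1])  # 10 0000010:col10 0000019:col19
```

Because the counter is padded to a fixed width, lexical order matches numeric order, which is what lets a single start/end pair select a contiguous block of columns. The sketch does not address how the counters are generated at load time.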

Thanks!!!!

/*
Joe Stein
http://www.linkedin.com/in/charmalloc
Twitter: @allthingshadoop
*/