From: Jonathan Ellis
Date: Thu, 27 May 2010 21:11:10 -0600
Subject: Re: Cassandra CF sharding
To: user@cassandra.apache.org
In-Reply-To: <4BFED789.4050501@trackstudio.com>

2) is correct, but for 1) I'm not sure what manageability improvements
you anticipate from dealing with multiple entities instead of one.

I'm not sure what you're thinking of for 3) but routing is done by key only.

2010/5/27 Maxim Kramarenko:
> Hello!
>
> We have mail archive with one large CF for mail body. In our case, it's easy
> to shard data to 5-10 CF by customer id. We like to do this because:
>
> 1) We get more manageable instances, because we have many small CF instead
> of one multi-TB CF on each node.
>
> 2) Better disk space usage (need to reserve 50% of the largest shard for
> compaction only)
>
> 3) Can manage node load not by token only, but also by defining shards
> available per node.
>
> Is my assumptions correct ? Any negative side effects ?
>

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
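
Editorial illustration of "routing is done by key only": the sketch below is
not Cassandra code. It is a minimal model, assuming a hash partitioner that
maps a row key to a token and a ring in which each node owns the range ending
at its own token. The names token_for, Ring, node_for, and the node and CF
names are hypothetical. The point it tries to show is that the column family
name never enters the placement decision, so splitting one CF into several
does not by itself change which node holds a given key.

```python
import bisect
import hashlib
from typing import Mapping


def token_for(key: bytes) -> int:
    # Assumption: an MD5-based hash partitioner; the token is the digest
    # read as an unsigned 128-bit integer.
    return int.from_bytes(hashlib.md5(key).digest(), "big")


class Ring:
    """Hypothetical token ring: each node owns (previous token, own token]."""

    def __init__(self, node_tokens: Mapping[str, int]):
        self._entries = sorted((tok, node) for node, tok in node_tokens.items())
        self._tokens = [tok for tok, _ in self._entries]

    def node_for(self, column_family: str, key: bytes) -> str:
        # column_family is accepted but deliberately unused:
        # placement depends on the key's token only.
        idx = bisect.bisect_left(self._tokens, token_for(key))
        # Keys past the highest token wrap around to the first node.
        return self._entries[idx % len(self._entries)][1]


# Four hypothetical nodes with evenly spaced tokens over the 128-bit space.
ring = Ring({f"node{i}": i * (2**128 // 4) for i in range(4)})

key = b"customer42:message1"
owner_single_cf = ring.node_for("mail_body", key)
owner_sharded_cf = ring.node_for("mail_body_shard_7", key)
print(owner_single_cf == owner_sharded_cf)
```

In this model, controlling which node serves which customer would require
changing the keys or the tokens, not the number of column families.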