From user-return-14135-apmail-cassandra-user-archive=cassandra.apache.org@cassandra.apache.org Wed Mar 02 08:43:50 2011
Return-Path:
Delivered-To: apmail-cassandra-user-archive@www.apache.org
Received: (qmail 42281 invoked from network); 2 Mar 2011 08:43:50 -0000
Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) by minotaur.apache.org with SMTP; 2 Mar 2011 08:43:50 -0000
Received: (qmail 26052 invoked by uid 500); 2 Mar 2011 08:43:48 -0000
Delivered-To: apmail-cassandra-user-archive@cassandra.apache.org
Received: (qmail 25886 invoked by uid 500); 2 Mar 2011 08:43:45 -0000
Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: user@cassandra.apache.org
Delivered-To: mailing list user@cassandra.apache.org
Received: (qmail 25878 invoked by uid 99); 2 Mar 2011 08:43:44 -0000
Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 02 Mar 2011 08:43:44 +0000
X-ASF-Spam-Status: No, hits=1.5 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS
X-Spam-Check-By: apache.org
Received-SPF: pass (athena.apache.org: domain of sylvain@datastax.com designates 209.85.161.172 as permitted sender)
Received: from [209.85.161.172] (HELO mail-gx0-f172.google.com) (209.85.161.172) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 02 Mar 2011 08:43:38 +0000
Received: by gxk5 with SMTP id 5so2634680gxk.31 for ; Wed, 02 Mar 2011 00:43:17 -0800 (PST)
MIME-Version: 1.0
Received: by 10.151.7.13 with SMTP id k13mr10204299ybi.373.1299055397013; Wed, 02 Mar 2011 00:43:17 -0800 (PST)
Received: by 10.146.86.1 with HTTP; Wed, 2 Mar 2011 00:43:16 -0800 (PST)
X-Originating-IP: [88.183.33.171]
In-Reply-To: <0D7E3391-7036-4CDA-AF81-0A2EF160C10F@cuttshome.net>
References: <0D7E3391-7036-4CDA-AF81-0A2EF160C10F@cuttshome.net>
Date: Wed, 2 Mar 2011 09:43:16 +0100
Message-ID:
Subject: Re: limit on rows in a cf
From: Sylvain Lebresne
To:
user@cassandra.apache.org
Cc: Shaun Cutts
Content-Type: multipart/alternative; boundary=000e0cd519947b9316049d7be718

--000e0cd519947b9316049d7be718
Content-Type: text/plain; charset=ISO-8859-1

On Tue, Mar 1, 2011 at 10:36 PM, Shaun Cutts wrote:

> This isn't quite true, I think. RandomPartitioner uses MD5. So if you had
> 10^16 rows, you would have a 10^-6 chance of a collision, according to
> http://en.wikipedia.org/wiki/Birthday_attack ... and apparently MD5 isn't
> quite balanced, so your actual odds of a collision are worse (though I'm not
> familiar with the literature).
>
> 10^16 is very large... but conceivable, I guess.
>

MD5s are used to distribute keys to nodes. So in theory you can have
multiple keys with the same token (MD5). This means they are sure to go to
the same node, but that's all. But in all fairness, Cassandra doesn't live
up to the theory quite yet: though you can have multiple keys for the same
MD5, some read operations (range_slice) will be buggy when that happens. See
https://issues.apache.org/jira/browse/CASSANDRA-1034, which should
(hopefully) be fixed soon.

What is true, however, is that you can't have more than 2^128 nodes with
RandomPartitioner (one for each MD5). But I'm really curious to see someone
hit that limit.

Btw, I'm not pretending Cassandra has no limit or anything that bold, merely
saying that I'm pretty sure the number of rows is not a concern.

--
Sylvain

>
> -- Shaun
>
>
> On Feb 16, 2011, at 4:05 AM, Sylvain Lebresne wrote:
>
> Sky is the limit.
>
> Columns in a row are limited to 2 billion because the size of a row is
> recorded in a Java int. A row must also fit on one node, so this also limits
> in a way the size of a row (if you have large values, you could be limited
> by this factor well before reaching 2 billion columns).
>
> The number of rows is never recorded anywhere (no data type limit). And
> rows are balanced over the cluster.
So there is no real limit outside what
> your cluster can handle (that is, the number of machines you can afford is
> probably the limit).
>
> Now, if a single node holds a huge number of rows, the only factor that
> comes to mind is that the sparse index kept in memory for the SSTable can
> start to take too much memory (depending on how much memory you have). In
> which case you can have a look at index_interval in cassandra.yaml. But as
> long as you don't start seeing nodes go OOM for no reason, this should not
> be a concern.
>
> --
> Sylvain
>
> On Wed, Feb 16, 2011 at 9:36 AM, Sasha Dolgy wrote:
>
>>
>> is there a limit or a factor to take into account when the number of rows
>> in a CF exceeds a certain number? i see the columns for a row can get
>> upwards of 2 billion ... can i have 2 billion rows without much issue?
>>
>> --
>> Sasha Dolgy
>> sasha.dolgy@gmail.com
>>
>
>
>

--000e0cd519947b9316049d7be718
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

On Tue, Mar 1, 2011 at 10:36 PM, Shaun Cutts <shaun@cuttshome.net> wrote:
This isn't quite true, I think. RandomPartitioner uses MD5. So if you had 10^16 rows, you would have a 10^-6 chance of a collision, according to http://en.wikipedia.org/wiki/Birthday_attack ... and apparently MD5 isn't quite balanced, so your actual odds of a collision are worse (though I'm not familiar with the literature).

10^16 is very large... but conceivable, I guess.
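
For reference, the collision odds quoted above can be checked with the standard birthday approximation. This is only a rough sketch: the function name is made up for illustration, and it assumes MD5 output is uniformly distributed over 128 bits, which the message above already notes may not quite hold.

```python
import math


def birthday_collision_probability(n_items: int, hash_bits: int) -> float:
    """Approximate P(at least one collision) among n_items uniform hashes."""
    space = 2 ** hash_bits
    # Birthday approximation: p ~= 1 - exp(-n * (n - 1) / (2 * space)).
    exponent = -(n_items * (n_items - 1)) / (2 * space)
    # expm1 keeps precision when the probability is tiny.
    return -math.expm1(exponent)


if __name__ == "__main__":
    for rows in (2 * 10**9, 10**16, 26 * 10**15):
        p = birthday_collision_probability(rows, 128)
        print(f"{rows:.1e} rows -> p ~= {p:.2e}")
```

Under that uniformity assumption, 10^16 rows gives a probability on the order of 10^-7, and the 10^-6 mark is reached at roughly 2.6 x 10^16 rows, so the quoted figure is in the right neighbourhood.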

MD5s are used to distribute keys to nodes. So in theory you can have multiple keys with the same token (MD5). This means they are sure to go to the same node, but that's all. But in all fairness, Cassandra doesn't live up to the theory quite yet: though you can have multiple keys for the same MD5, some read operations (range_slice) will be buggy when that happens. See https://issues.apache.org/jira/browse/CASSANDRA-1034, which should (hopefully) be fixed soon.
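
To make the key-to-node mapping concrete, here is a minimal sketch of how a key could be turned into a token and placed on a ring. It is not taken from the Cassandra source: the function names and the 4-node ring are hypothetical, and deriving the token as the absolute value of the MD5 digest read as a signed big-endian integer is an assumption about how RandomPartitioner behaves.

```python
import bisect
import hashlib
from typing import Sequence


def md5_token(key: bytes) -> int:
    """Token for a key (assumed scheme): |MD5 digest as a signed big-endian int|."""
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


def owner(token: int, ring: Sequence[int]) -> int:
    """Return the token of the node owning `token`.

    `ring` must be sorted. Each node owns the range (previous token, own token],
    and tokens above the last node wrap around to the first one.
    """
    idx = bisect.bisect_left(ring, token)
    return ring[idx % len(ring)]


# Hypothetical 4-node ring with evenly spaced tokens over [0, 2**127).
RING = [i * (2**127 // 4) for i in range(4)]

if __name__ == "__main__":
    for key in (b"alice", b"bob", b"carol"):
        token = md5_token(key)
        print(key, token, "-> node with token", owner(token, RING))
```

Placement depends only on the token, so two keys that happened to share an MD5 would always resolve to the same node, which is the point made above.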

What is true however is that you can't have more than 2^128 nodes with RandomPartitioner (one for each MD5). But I'm really curious to see someone hit that limit.
Btw, I'm not pretending Cassandra has no limit or anything that bold, merely saying that I'm pretty sure the number of rows is not a concern.

--
Sylvain


-- Shaun


On Feb 16, 2011, at 4:05 AM, Sylvain Lebresne wrote:
Sky is the limit.

Columns in a row are limited to 2 billion because the size of a row is recorded in a Java int. A row must also fit on one node, so this also limits in a way the size of a row (if you have large values, you could be limited by this factor well before reaching 2 billion columns).

The number of rows is never recorded anywhere (no data type limit). And rows are balanced over the cluster. So there is no real limit outside what your cluster can handle (that is, the number of machines you can afford is probably the limit).

Now, if a single node holds a huge number of rows, the only factor that comes to mind is that the sparse index kept in memory for the SSTable can start to take too much memory (depending on how much memory you have). In which case you can have a look at index_interval in cassandra.yaml. But as long as you don't start seeing nodes go OOM for no reason, this should not be a concern.
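
As a back-of-the-envelope illustration of why index_interval matters, the sketch below estimates the memory taken by the sampled index on one node. Everything numeric here is an assumption for illustration only: the default interval of 128, the average key size, and the per-entry overhead are placeholders, and the model ignores the fact that a row can appear in several SSTables.

```python
def index_sample_entries(rows_on_node: int, index_interval: int = 128) -> int:
    """Number of sampled index entries if one key in every index_interval is kept."""
    # Ceiling division without floating point.
    return -(-rows_on_node // index_interval)


def estimated_index_memory_bytes(
    rows_on_node: int,
    index_interval: int = 128,           # assumed default, check your cassandra.yaml
    avg_key_bytes: int = 32,             # hypothetical average row key size
    per_entry_overhead_bytes: int = 64,  # hypothetical JVM object overhead per entry
) -> int:
    """Rough memory estimate for the in-memory index samples on one node."""
    entries = index_sample_entries(rows_on_node, index_interval)
    return entries * (avg_key_bytes + per_entry_overhead_bytes)


if __name__ == "__main__":
    rows = 2_000_000_000
    for interval in (128, 256, 512):
        mib = estimated_index_memory_bytes(rows, interval) / 2**20
        print(f"index_interval={interval}: ~{mib:.0f} MiB for {rows} rows")
```

With these placeholder numbers, doubling index_interval halves the estimate, at the cost of scanning more index entries on disk per lookup.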

--
Sylvain

On Wed, Feb 16, 2011 at 9:36 AM, Sasha Dolgy <sdolgy@gmail.com> wrote:

is there a limit or a factor to take into account when the number of rows in a CF exceeds a certain number? i see the columns for a row can get upwards of 2 billion ... can i have 2 billion rows without much issue?

--
Sasha Dolgy
sasha.dolgy@gmail.com



--000e0cd519947b9316049d7be718--