From: Ilya Sviridov
To: user@cassandra.apache.org
Date: Mon, 14 Jul 2014 14:35:19 +0300
Subject: Re: keyspace with hundreds of columnfamilies

Tommaso, looking at your description of the architecture, an idea came up.

You can perform sharding in the Cassandra client and write to different Cassandra clusters to keep the number of column families reasonable.

With best regards,
Ilya

On Thu, Jul 3, 2014 at 10:55 PM, tommaso barbugli <tbarbugli@gmail.com> wrote:

> Thank you for the replies; I am rethinking the schema design. One possible
> solution is to "implode" one dimension and get N times fewer CFs.
> With this approach I would come up with (CQL) tables with up to 100
> columns; would that be a problem?
>
> Thank you,
> Tommaso
>
> 2014-07-02 23:43 GMT+02:00 Jack Krupansky <jack@basetechnology.com>:
>
>> The official answer, engraved in stone tablets, and carried down from
>> the mountain: “Although having more than dozens or hundreds of tables
>> defined is almost certainly a Bad Idea (just as it is a design smell in a
>> relational database), it's relatively straightforward to allow disabling
>> the SlabAllocator.” Emphasis on “almost certainly a Bad Idea.”
>>
>> See:
>> https://issues.apache.org/jira/browse/CASSANDRA-5935
>> “Allow disabling slab allocation”
>>
>> IOW, this is considered an anti-pattern, but...
>>
>> -- Jack Krupansky
>>
>> *From:* tommaso barbugli
>> *Sent:* Wednesday, July 2, 2014 2:16 PM
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: keyspace with hundreds of columnfamilies
>>
>> Hi,
>> thank you for your replies on this. Regarding the arena memory: is this a
>> fixed memory allocation, or is it some sort of in-memory caching? I ask
>> because I think that a substantial portion of the column families created
>> will not be queried that frequently (and some will become inactive and
>> stay like that for a really long time).
>>
>> Thank you,
>> Tommaso
>>
>> 2014-07-02 18:35 GMT+02:00 Romain HARDOUIN <romain.hardouin@urssaf.fr>:
>>
>>> Arena allocation is an improvement feature, not a limitation.
>>> It was introduced in Cassandra 1.0 in order to lower memory
>>> fragmentation (and therefore promotion failure).
>>> AFAIK it's not intended to be tweaked, so it might not be a good idea to
>>> change it.
>>>
>>> Best,
>>> Romain
>>>
>>> tommaso barbugli <tbarbugli@gmail.com> wrote on 02/07/2014 17:40:18:
>>>
>>> > From: tommaso barbugli <tbarbugli@gmail.com>
>>> > To: user@cassandra.apache.org
>>> > Date: 02/07/2014 17:40
>>> > Subject: Re: keyspace with hundreds of columnfamilies
>>> >
>>> > 1 MB per column family sounds pretty bad to me; is this something I
>>> > can tweak or work around somehow?
>>> >
>>> > Thanks
>>> > Tommaso
>>> >
>>> > 2014-07-02 17:21 GMT+02:00 Romain HARDOUIN <romain.hardouin@urssaf.fr>:
>>> > The trap is that each CF will consume 1 MB of memory due to arena
>>> > allocation.
>>> > This might seem harmless, but if you plan thousands of CFs it means
>>> > thousands of megabytes...
>>> > Up to 1,000 CFs I think it could be doable, but not 10,000.
>>> >
>>> > Best,
>>> >
>>> > Romain
>>> >
>>> > tommaso barbugli <tbarbugli@gmail.com> wrote on 02/07/2014 10:13:41:
>>> >
>>> > > From: tommaso barbugli <tbarbugli@gmail.com>
>>> > > To: user@cassandra.apache.org
>>> > > Date: 02/07/2014 10:14
>>> > > Subject: keyspace with hundreds of columnfamilies
>>> > >
>>> > > Hi,
>>> > > Are there any known issues or shortcomings with organising data in
>>> > > hundreds of column families?
>>> > > At present I am running with 300 column families, but I expect
>>> > > that to get to a couple of thousand.
>>> > > Is this something discouraged / unsupported? (I am using Cassandra
>>> > > 2.0.)
>>> > >
>>> > > Thanks
>>> > > Tommaso
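
To put Romain's figure in the thread above into numbers, here is a rough back-of-the-envelope sketch. It assumes the roughly 1 MB of arena memory per column family quoted in the thread and nothing else; the function name and the sample counts are only illustrative, and real overhead depends on the Cassandra version and configuration.

```python
# Rough estimate of arena (slab) overhead as the number of column families grows.
# ARENA_MB_PER_CF is the ~1 MB per column family figure quoted in the thread;
# it is an assumption here, not a measured value.
ARENA_MB_PER_CF = 1


def arena_overhead_mb(num_column_families, mb_per_cf=ARENA_MB_PER_CF):
    """Return the estimated arena overhead in MB for a given number of CFs."""
    return num_column_families * mb_per_cf


if __name__ == "__main__":
    # 300 is the current count from the thread; the others are the sizes discussed.
    for n in (300, 1_000, 2_000, 10_000):
        mb = arena_overhead_mb(n)
        print(f"{n:>6} CFs -> about {mb:,} MB ({mb / 1024:.1f} GB)")
```

Under that assumption the overhead grows linearly with the number of column families, which is why a few hundred looks tolerable while ten thousand does not.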
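
The client-side sharding suggested at the top of this message could be sketched roughly as follows. This is a minimal illustration, not a tested design: the cluster names, the `feed_N` column family names, and both helper functions are hypothetical, and no Cassandra driver calls are shown. The only idea it demonstrates is routing each column family to one of several clusters with a stable hash, so that no single cluster has to hold all of them.

```python
import hashlib

# Hypothetical cluster identifiers; in a real client each one would map to
# the contact points (or an open driver session) of a separate Cassandra cluster.
CLUSTERS = ["cluster-a", "cluster-b", "cluster-c"]


def cluster_for(column_family, clusters=CLUSTERS):
    """Pick the cluster for a column family using a stable hash of its name."""
    # md5 is used only because it is stable across processes and runs
    # (unlike Python's built-in hash() for strings), not for security.
    digest = hashlib.md5(column_family.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(clusters)
    return clusters[index]


def plan_placement(column_families, clusters=CLUSTERS):
    """Group column family names by the cluster they would be written to."""
    placement = {name: [] for name in clusters}
    for cf in column_families:
        placement[cluster_for(cf, clusters)].append(cf)
    return placement


if __name__ == "__main__":
    # 3,000 hypothetical column families spread over three clusters.
    cfs = [f"feed_{i}" for i in range(3000)]
    for cluster, members in plan_placement(cfs).items():
        print(cluster, len(members))
```

One caveat with plain modulo hashing is that changing the number of clusters remaps most column families; a fixed lookup table or consistent hashing would avoid that, at the cost of extra bookkeeping in the client.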