cassandra-user mailing list archives

From Jack Krupansky <jack.krupan...@gmail.com>
Subject Re: Practical limit on number of column families
Date Tue, 01 Mar 2016 14:07:06 GMT
I'll defer to one of the senior committers as to whether they want that
information disseminated any further than it already is. It was
intentionally not documented since it is not recommended. If your Jira
search fu is strong enough you should be able to find it yourself, but
again, its use is strongly not recommended.

As the Jira notes, "having more than dozens or hundreds of tables defined
is almost certainly a Bad Idea."

"Bad Idea" means not good. As in don't go there. And if you do, don't
expect such a mis-adventure to be supported by the community.

-- Jack Krupansky

On Tue, Mar 1, 2016 at 8:39 AM, Vlad <qa23d-vvd@yahoo.com> wrote:

> Hi Jack,
>
> >you can reduce the overhead per table (an undocumented Jira
> Can you please point to this Jira number?
>
> >it is strongly not recommended
> What are the consequences of this (besides performance degradation, if any)?
>
> Thanks.
>
>
> On Tuesday, March 1, 2016 7:23 AM, Jack Krupansky <
> jack.krupansky@gmail.com> wrote:
>
>
> 3,000 entries? What's an "entry"? Do you mean row, column, or... what?
>
> You are using the obsolete terminology of CQL2 and Thrift - column family.
> With CQL3 you should be creating "tables". The practical recommendation of
> an upper limit of a few hundred tables across all keyspaces remains.
>
> Technically you can go higher and technically you can reduce the overhead
> per table (an undocumented Jira - intentionally undocumented since it is
> strongly not recommended), but... it is unlikely that you will be happy
> with the results.
>
> What is the nature of the use case?
>
> You basically have two choices: an additional clustering column to
> distinguish categories of table, or separate clusters for each few hundred
> tables.
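
A minimal sketch of the first option above, assuming it means folding many logical tables into one physical table with an extra key column. The keyspace `app`, table `items`, and columns `id`, `category`, and `payload` are hypothetical names chosen for illustration and do not come from the thread. The code only builds CQL strings and does not connect to a cluster.

```python
# Hypothetical names throughout: keyspace "app", table "items",
# columns "id", "category", "payload". None come from the thread.

def create_shared_table_cql(keyspace: str = "app", table: str = "items") -> str:
    """Build a CREATE TABLE statement for one physical table in which
    `category` is a clustering column standing in for what would
    otherwise be a separate table per category."""
    return (
        f"CREATE TABLE IF NOT EXISTS {keyspace}.{table} (\n"
        "    id text,\n"
        "    category text,\n"
        "    payload text,\n"
        "    PRIMARY KEY ((id), category)\n"
        ");"
    )


def select_cql(keyspace: str = "app", table: str = "items") -> str:
    """Build a parameterized read of one row from one logical category."""
    return f"SELECT payload FROM {keyspace}.{table} WHERE id = ? AND category = ?;"


if __name__ == "__main__":
    print(create_shared_table_cql())
    print(select_cql())
```

With this layout the number of tables stays fixed no matter how many categories exist. Whether `category` works better as a clustering column or as part of the partition key depends on the access pattern, which the thread does not describe.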
>
>
> -- Jack Krupansky
>
> On Mon, Feb 29, 2016 at 12:30 PM, Fernando Jimenez <
> fernando.jimenez@wealth-port.com> wrote:
>
> Hi all
>
> I have a use case for Cassandra that would require creating a large number
> of column families. I have found references to early versions of Cassandra
> where each column family would require a fixed amount of memory on all
> nodes, effectively imposing an upper limit on the total number of CFs. I
> have also seen rumblings that this may have been fixed in later versions.
>
> To put the question to rest, I have set up a DSE sandbox and created some
> code to generate column families populated with 3,000 entries each.
>
> Unfortunately I have now hit this issue:
> https://issues.apache.org/jira/browse/CASSANDRA-9291
>
> So I will have to retest against Cassandra 3.0 instead.
>
> However, I would like to understand the limitations regarding creation of
> column families.
>
> * Is there a practical upper limit?
> * Is this a fixed limit, or does it scale as more nodes are added into the
> cluster?
> * Is there a difference between one keyspace with thousands of column
> families, vs thousands of keyspaces with only a few column families each?
>
> I haven’t found any hard evidence/documentation to help me here, but if
> you can point me in the right direction, I will oblige and RTFM away.
>
> Many thanks for your help!
>
> Cheers
> FJ
>
