cassandra-commits mailing list archives

From "Arun Chaitanya Miriappalli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6794) Optimise slab allocator to enable higher number of column families
Date Mon, 22 Jun 2015 07:02:03 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595443#comment-14595443 ]

Arun Chaitanya Miriappalli commented on CASSANDRA-6794:
-------------------------------------------------------

I completely understand that "large numbers of CFs" is an anti-pattern. But unfortunately,
in our use case we have many CFs.

We have now settled on the following approach: use off-heap memory.

Modifications to default cassandra.yaml and cassandra-env.sh
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 * memory_allocator: JEMallocAllocator (https://issues.apache.org/jira/browse/CASSANDRA-7883)
 * memtable_allocation_type: offheap_objects

 With the two settings above, slab allocation (https://issues.apache.org/jira/browse/CASSANDRA-5935),
 which requires 1 MB of heap memory per table, is disabled. Memory for table metadata, caches and
 memtables is thus allocated natively and does not affect GC performance.

 * tombstone_failure_threshold: 100000000
   Without this, C* throws TombstoneOverwhelmingException during startup.
   This setting looks problematic, so I would like to know why merely creating tables
   produces so many tombstones.

 * -XX:+UseG1GC
   This helps reduce GC time.
   Without it, full GCs longer than 1 s are observed.
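
To make the heap cost of slab allocation concrete, here is a rough back-of-the-envelope
sketch. It assumes exactly one 1 MB on-heap slab per table, which is the figure quoted above;
the function name is hypothetical and real heap usage will differ.

```python
# Assumption taken from the comment above: slab allocation costs
# 1 MB of heap per table. Actual usage in a running node may differ.
SLAB_MB_PER_TABLE = 1


def slab_heap_mb(num_tables, slab_mb=SLAB_MB_PER_TABLE):
    """Estimate the heap (in MB) held by slabs if every table keeps one slab."""
    return num_tables * slab_mb


if __name__ == "__main__":
    for tables in (500, 5000):
        print(f"{tables} tables -> about {slab_heap_mb(tables)} MB of heap for slabs")
```

Under that assumption, 5000 tables would pin roughly 5000 MB of heap before any data is
written, which is why moving the allocation off heap matters for this use case.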

We created 5000 column families with about 1000 entries per column family. The read/write
performance seems to be stable and comparable.
The only problem we saw is with startup time.

Metric                      500 CFs (on heap)    5000 CFs (off heap)
Cassandra start time (s)    20                   349
Average CPU usage (%)       40                   49.65
GC activity (%)             2.6                  0.6
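
As a quick reading of these numbers, the sketch below computes the startup time per column
family for both runs. The figures are copied from the measurements above; the dictionary and
function names are hypothetical. Note that the two runs differ in both CF count and allocation
mode, so this does not isolate either factor.

```python
# Startup measurements copied from the table above.
# Keys are the number of column families in each run.
START_TIME_S = {500: 20, 5000: 349}


def startup_per_cf(num_cfs):
    """Average startup seconds spent per column family in a given run."""
    return START_TIME_S[num_cfs] / num_cfs


def startup_ratio(small=500, large=5000):
    """How many times longer the larger run took to start."""
    return START_TIME_S[large] / START_TIME_S[small]


if __name__ == "__main__":
    for cfs in sorted(START_TIME_S):
        print(f"{cfs} CFs: {startup_per_cf(cfs):.4f} s per CF")
    print(f"startup time ratio: {startup_ratio():.2f}x for 10x the CFs")
```

If startup scaled linearly with the number of CFs, a 10x increase would give about 200 s;
the measured 349 s suggests the cost per CF grows as CFs are added.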

I would like to know whether any problems are foreseen with this setup in a production environment.
Sorry if this is not the right place to ask this question.


> Optimise slab allocator to enable higher number of column families
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-6794
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6794
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jeremy Hanna
>            Priority: Minor
>
> Currently the slab allocator allocates 1MB per column family.  This has been very beneficial
> for gc efficiency.  However, it makes it more difficult to have large numbers of column families.
> It would be preferable to have a more intelligent way to allocate slabs so that there
> is more flexibility between slab allocator and non-slab allocator behaviour.
> A simple first step is to ramp up size of slabs from small (say  8KB) when empty, to
> 1MB after a few slabs.
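
The ramp-up proposed in the issue description could look roughly like the sketch below. Only
the 8 KB starting size and the 1 MB ceiling come from the description; the doubling growth
factor and the function name are assumptions made for illustration, not Cassandra's
implementation.

```python
def slab_sizes_kb(initial_kb=8, max_kb=1024, growth=2):
    """List successive slab sizes in KB, growing from initial_kb up to max_kb.

    The growth factor is an assumption; the issue only says "after a few slabs".
    """
    sizes = []
    size = initial_kb
    while size < max_kb:
        sizes.append(size)
        size *= growth
    sizes.append(max_kb)  # every later slab stays at the ceiling
    return sizes


if __name__ == "__main__":
    print(slab_sizes_kb())
    # Heap held by 5000 nearly empty tables with one slab each, in KB.
    print(5000 * slab_sizes_kb()[0], "KB with an 8 KB first slab")
    print(5000 * 1024, "KB with a fixed 1 MB slab")
```

With these assumed parameters, 5000 nearly empty tables would hold about 40,000 KB in first
slabs instead of about 5,120,000 KB with fixed 1 MB slabs.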



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
