cassandra-commits mailing list archives

From "Benedict (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-8897) Remove FileCacheService, instead pooling the buffers
Date Wed, 08 Apr 2015 10:47:12 GMT


Benedict commented on CASSANDRA-8897:

bq. for page alignment we create a bigger buffer and slice it on an aligned buffer, is there
a better way to do this?

No, but you can (and should) allocate a large block of buffers so that you only have to truncate
one unit of alignment for all buffers - say 512KB/1MB chunks, from which we slice smaller buffers.
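
As a sketch of what that looks like (assuming Java 9+, where {{ByteBuffer.alignedSlice}} is available; the 4KB page and 1MB chunk sizes are illustrative):

```java
import java.nio.ByteBuffer;

public class AlignedChunk {
    static final int PAGE_SIZE = 4096;      // assumed page size, illustrative
    static final int CHUNK_SIZE = 1 << 20;  // 1MB chunk shared by many buffers

    public static void main(String[] args) {
        // Over-allocate by one page so that a page-aligned region of at least
        // CHUNK_SIZE bytes is guaranteed to fit inside the raw allocation.
        ByteBuffer raw = ByteBuffer.allocateDirect(CHUNK_SIZE + PAGE_SIZE);
        ByteBuffer chunk = raw.alignedSlice(PAGE_SIZE); // page-aligned view

        // Only one unit of alignment is wasted for the whole chunk; smaller
        // buffers sliced off the front of it stay aligned for free.
        chunk.limit(64 * 1024);
        ByteBuffer small = chunk.slice();

        System.out.println(chunk.alignmentOffset(0, PAGE_SIZE)); // 0 = aligned
        System.out.println(small.capacity());
    }
}
```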

bq. then they get evicted if they get cold.

The problem with the strategy you've taken is that we only evict entire queues, meaning we
aren't very flexible. We also evict everything if the server is quiet for a period. This could
lead to an odd situation: say, an infrequent spurt of traffic with an uncommon page size,
followed by a steady drip of queries using that size, and then a 0.5s lull in the regular main
type of traffic - with this main traffic now never getting to cache its buffers. More typically
it's likely to lead to a random allocation of memory between the pools. There is also a race
condition that could leak memory.

There are a lot of ways to skin this cat, but my suggestion would be something much simpler,
since we don't much mind the object allocation of the buffer wrapper, only that of its main
body. Although we could avoid that too, so here are two suggestions:


* Have a shared queue for all buffer sizes, of slabs of some size, which are page aligned
* On allocation we increment a count, slice the buffer size we need from the current slab,
and set the buffer's attachment field to the slab it's from (or, have a map from parent buffer
to slab)
* On deallocation we decrement the count, and if that's hit zero we recycle the slab
* If we want to be smart, we can have valid ranges we can slice from, but I don't think that's
necessary. One thing we can do, though, is to collect all of the buffers we need to service
a single read request upfront, so that they all have the same lifespan and we don't promote
fragmentation. Perhaps as a follow up ticket.
* If we exceed our limit, we allocate a buffer of only exactly the size we need (and don't
bother page aligning)
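
A minimal sketch of that first shape (names are mine; real code would use the direct buffer's attachment field rather than a side map, would recycle freed slabs, and would enforce the memory limit; {{alignedSlice}} assumes Java 9+):

```java
import java.nio.ByteBuffer;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Map;

// Sketch: one shared pool of page-aligned slabs serving all buffer sizes.
// Each slab counts the buffers sliced from it and is recycled at zero.
public class SlabPool {
    static final int PAGE_SIZE = 4096;
    static final int SLAB_SIZE = 1 << 20;

    static final class Slab {
        final ByteBuffer memory =
            ByteBuffer.allocateDirect(SLAB_SIZE + PAGE_SIZE).alignedSlice(PAGE_SIZE);
        int inUse; // buffers sliced out and not yet freed
    }

    // Stand-in for the buffer's attachment field: identity map buffer -> slab
    // (identity, because ByteBuffer's own hashCode changes with its content).
    private final Map<ByteBuffer, Slab> owner =
        Collections.synchronizedMap(new IdentityHashMap<>());
    private Slab current = new Slab();

    // Allocate: bump the slab's count and slice the requested size from it.
    public synchronized ByteBuffer allocate(int size) {
        if (current.memory.remaining() < size)
            current = new Slab(); // real code: reuse a free slab, honour limit
        current.inUse++;
        int pos = current.memory.position();
        current.memory.limit(pos + size);
        ByteBuffer result = current.memory.slice(); // view of [pos, pos + size)
        current.memory.position(pos + size);
        current.memory.limit(current.memory.capacity());
        owner.put(result, current);
        return result;
    }

    // Free: decrement the count; when it hits zero the whole slab is reusable.
    public synchronized void free(ByteBuffer buffer) {
        Slab slab = owner.remove(buffer);
        if (slab != null && --slab.inUse == 0)
            slab.memory.clear(); // slab fully free: reset for reuse
    }
}
```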

A little more complex (but not necessarily better):

* Have separate queue for each buffer size/type, still allocate slabs
* Maintain each slab in a globally shared LRU queue, and a local stack
* Serve requests from the top slab on the stack; when it's exhausted, pop it; when the slab
is fully (or perhaps partially, if the stack is empty) available again, push it back onto
the top of the stack
* If the stack is empty, and there is available room, allocate a new slab; otherwise deallocate
the oldest shared slab; if this slab is still in use, allocate a buffer of exactly the size
we want and non-page-aligned
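
Condensed into code, that stack discipline might look like this (names are mine; the globally shared LRU queue, eviction, and thread-safety are elided):

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Per-size pool: serve from the top slab on a stack, pop it when exhausted,
// and push a slab back once all of its buffers have been returned.
public class SizedPool {
    static final int SLAB_SIZE = 1 << 16; // small, for illustration

    static final class Slab {
        final ByteBuffer memory = ByteBuffer.allocateDirect(SLAB_SIZE);
        int outstanding;   // buffers sliced out and not yet freed
        boolean onStack;
    }

    final int bufferSize;                 // fixed size served by this pool
    final Deque<Slab> stack = new ArrayDeque<>();

    public SizedPool(int bufferSize) { this.bufferSize = bufferSize; }

    public ByteBuffer allocate() {
        Slab top = stack.peek();
        if (top != null && top.memory.remaining() < bufferSize) {
            stack.pop();                  // exhausted: pop it
            top.onStack = false;
        }
        if (stack.isEmpty()) {
            // real code: respect the global limit, else evict the oldest
            // shared slab, else fall back to an exact-size unaligned buffer
            Slab slab = new Slab();
            slab.onStack = true;
            stack.push(slab);
        }
        top = stack.peek();
        top.outstanding++;
        int pos = top.memory.position();
        top.memory.limit(pos + bufferSize);
        ByteBuffer result = top.memory.slice();
        top.memory.position(pos + bufferSize);
        top.memory.limit(SLAB_SIZE);
        return result;
    }

    // Real code would map buffer -> slab (e.g. via the attachment field); the
    // sketch takes the slab directly to keep the bookkeeping visible.
    public void free(Slab slab) {
        if (--slab.outstanding == 0) {
            slab.memory.clear();          // fully available again
            if (!slab.onStack) {
                stack.push(slab);         // back onto the top of the stack
                slab.onStack = true;
            }
        }
    }
}
```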

These are just suggestions; there are lots of possibilities when building a cache/pool like
this.

bq. at the moment only the compressed RAR uses direct allocation

We should probably switch all readers to use direct. In fact we should probably not allocate
heap buffers in any situation where it isn't absolutely necessary.

> Remove FileCacheService, instead pooling the buffers
> ----------------------------------------------------
>                 Key: CASSANDRA-8897
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Stefania
>             Fix For: 3.0
> After CASSANDRA-8893, a RAR will be a very lightweight object and will not need caching,
so we can eliminate this cache entirely. Instead we should have a pool of buffers that are

This message was sent by Atlassian JIRA
