cassandra-commits mailing list archives

From "Pavel Yaskevich (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6689) Partially Off Heap Memtables
Date Tue, 04 Mar 2014 01:10:23 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918866#comment-13918866 ]

Pavel Yaskevich commented on CASSANDRA-6689:
--------------------------------------------

bq. It only starts a new thread if there is GC work to be done, room to do it in, and the
static pool doesn't have any threads that haven't timed out. The cleaner thread itself is
started once and remains forever. If you mean you're worried about flushing so many memtables
that all have GC work possible on them and that this would spam the collector thread pool,
I guess I agree it's a theoretical possibility, but it would have to be pretty extreme. We
can spin a thousand threads and not break a sweat, but even with tens of thousands of memtables
we'll hit our memory limit and stop being able to do any more GC before we cause major problems.
That said, we could safely impose a cap on the pool if we wanted. Set it largeish so it isn't
a bottleneck, but prevent this kind of problem.

What I want is to see more control over how we deal with cleaner threads. The main point is
to be able to run with smaller heaps, and since each thread stack could consume ~300K, we risk
an OOM in certain situations because we don't control the whole thread lifetime. That control
would also help people in environments constrained by ulimit or similar.
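
To illustrate the kind of cap being discussed, here is a minimal sketch of a bounded pool whose
idle threads time out. It is not the code from the patch: the class name, the cap of 8 threads,
and the 60 second idle timeout are hypothetical values chosen for the example. At ~300K per
stack, a cap of 8 would bound the stack memory of this pool to roughly 2.4M.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not Cassandra code: a collector pool with a hard thread cap.
class BoundedCleanerPool {
    // Assumed values for illustration only.
    static final int MAX_THREADS = 8;
    static final long IDLE_TIMEOUT_SECONDS = 60;

    static ThreadPoolExecutor newPool(int maxThreads, long idleTimeoutSeconds) {
        // With an unbounded queue, ThreadPoolExecutor never grows past corePoolSize,
        // so setting core == max makes maxThreads the hard cap; extra work queues up.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            maxThreads, maxThreads,
            idleTimeoutSeconds, TimeUnit.SECONDS,
            new LinkedBlockingQueue<Runnable>());
        // Let idle threads exit so their stacks are released when there is no GC work.
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }

    // Submits no-op tasks and reports the largest number of threads ever alive at once.
    static int largestPoolSizeAfter(int maxThreads, int tasks) {
        ThreadPoolExecutor pool = newPool(maxThreads, 1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> { /* placeholder for per-memtable GC work */ });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return pool.getLargestPoolSize();
    }

    public static void main(String[] args) {
        int largest = largestPoolSizeAfter(MAX_THREADS, 1000);
        System.out.println("largest pool size: " + largest + " (cap " + MAX_THREADS + ")");
    }
}
```

The trade-off in this sketch is that work beyond the cap waits in the queue instead of getting
its own thread, so the cap would need to be large enough not to become a bottleneck.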

bq. I will give it some thought though. On this vein, I would like to see thread local allocation
buffers, but again this is an optimisation rather than a complexity reducer.

Please do think about it; we need to try to reduce the complexity of this to minimize the
bus-factor risk :)

bq. We're currently dealing with allocations that necessarily outlive the duration of the
operation, so I'm not sure how well that would apply here, but possibly you have some specific
examples? It may be that we decide to use this code for creating the mutations from thrift/native/MS
etc., though, in which case it would be very helpful.

Exactly, maybe it's time to re-evaluate the trade-off between copying data to pre-allocated
regions with a limited TTL vs. tracking data all over the place.

bq. Thanks. I'll incorporate these into CASSANDRA-6694 instead if that's okay with you?

Sure.

[~krummas] Have you had a chance to do more on- vs. off-heap performance testing?

> Partially Off Heap Memtables
> ----------------------------
>
>                 Key: CASSANDRA-6689
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6689
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>             Fix For: 2.1 beta2
>
>         Attachments: CASSANDRA-6689-small-changes.patch
>
>
> Move the contents of ByteBuffers off-heap for records written to a memtable.
> (See comments for details)



--
This message was sent by Atlassian JIRA
(v6.2#6252)
