cassandra-commits mailing list archives

From "Benedict (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables
Date Tue, 08 Apr 2014 23:32:23 GMT


Benedict commented on CASSANDRA-6694:

bq. I'm getting mixed signals here, are you claiming that JVM does a bad job or OOP is broken
in general? Also CASSANDRA-6993 seems to point to a different problem.

I'm saying that performance-critical code is impacted when it makes virtual method calls that
cannot be optimised by the VM (i.e. those with multiple live implementations). I meant
CASSANDRA-6553 and CASSANDRA-6934.
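The virtual-call concern can be sketched as follows (a hypothetical illustration, not Cassandra's actual Cell API; the class names and size figures are invented). A call site that only ever sees one implementation can be devirtualised and inlined by the JIT; once two or more implementations are live at the site, it remains a true virtual dispatch:

```java
// Hypothetical sketch: two implementations behind one interface.
interface Cell {
    long unsharedHeapSize();
}

class BufferCell implements Cell {
    public long unsharedHeapSize() { return 48; }  // illustrative figure
}

class NativeCell implements Cell {
    public long unsharedHeapSize() { return 16; }  // illustrative figure
}

public class DispatchSketch {
    // With both implementations reaching this loop, the call below is
    // polymorphic and cannot be devirtualised into a direct (inlinable) call.
    static long totalHeapSize(Cell[] cells) {
        long total = 0;
        for (Cell c : cells)
            total += c.unsharedHeapSize();  // true virtual dispatch
        return total;
    }

    public static void main(String[] args) {
        Cell[] cells = { new BufferCell(), new NativeCell() };
        System.out.println(totalHeapSize(cells));
    }
}
```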

bq. We can have a Cell separate implementation with multiple buffers as Thrift allocates them
anyway which we are going to be transformed to linear ones once they get into memtable as
we have to reallocate there.

Then what exactly do we win? We would still have two hierarchies and the same modularisation.
The potential ease of optimising comparisons also disappears, and we still pay the increased
indirection and virtual method call costs. If this is the suggestion, I am very -1: the
payoff is small, the work nontrivial, and the negatives substantial.

> Slightly More Off-Heap Memtables
> --------------------------------
>                 Key: CASSANDRA-6694
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>              Labels: performance
>             Fix For: 2.1 beta2
> The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as the on-heap
overhead is still very large. It should not be tremendously difficult to extend these changes
so that we allocate entire Cells off-heap, instead of multiple BBs per Cell (with all their
associated overhead).
> The goal (if possible) is to reach an overhead of 16 bytes per Cell (plus 4-6 bytes per
cell on average for the btree overhead, for a total of around 20-22 bytes). The 16 bytes
break down as: an 8-byte object header; a 4-byte address (using alignment tricks, as the VM
does, to let us address a reasonably large memory space, although this trick is unlikely to
last us forever, at which point we will have to bite the bullet and accept a 24-byte per-cell
overhead); and a 4-byte object reference for maintaining our internal list of allocations.
That reference is unfortunately necessary, since without it we cannot safely (and cheaply)
walk the object graph we allocate, which we need for (allocation-) compaction and pointer
rewriting.
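The "alignment tricks like the VM" remark refers to the same idea as the JVM's compressed oops: store a 4-byte index instead of an 8-byte raw pointer, and recover the full address by shifting. A minimal sketch of the arithmetic, assuming 8-byte allocation alignment and an illustrative region base (neither value is from the ticket):

```java
// Sketch of the compressed-address idea, not the committed implementation.
public class CompressedAddress {
    static final int SHIFT = 3;             // assume allocations 8-byte aligned
    static final long BASE = 0x100000000L;  // illustrative region base address

    // Store offsets in units of 8 bytes, so 32 bits cover 2^35 bytes (32 GB).
    static int encode(long address) {
        return (int) ((address - BASE) >>> SHIFT);
    }

    static long decode(int compressed) {
        return BASE + ((compressed & 0xFFFFFFFFL) << SHIFT);
    }

    public static void main(String[] args) {
        long addr = BASE + 1024;
        int c = encode(addr);
        System.out.println(decode(c) == addr);
        // addressable span: 2^32 slots * 8-byte granularity
        System.out.println(((1L << 32 << SHIFT) / (1L << 30)) + " GB");
    }
}
```

Once allocations outgrow what the shifted 32-bit index can reach, the scheme fails, which is the "unlikely to last us forever" caveat above.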
> The ugliest thing here is going to be implementing the various CellName instances so
that they may be backed by native memory OR heap memory.
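The dual-backing requirement for CellName could look roughly like the following (a hedged sketch with invented names; a direct ByteBuffer stands in for raw native memory to keep the example self-contained):

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of a name type backed by heap OR native memory.
interface CellName {
    int size();
    byte byteAt(int i);
}

class HeapCellName implements CellName {
    private final byte[] bytes;
    HeapCellName(byte[] bytes) { this.bytes = bytes; }
    public int size() { return bytes.length; }
    public byte byteAt(int i) { return bytes[i]; }
}

class NativeCellName implements CellName {
    private final ByteBuffer buffer;  // direct buffer stands in for native memory
    NativeCellName(ByteBuffer buffer) { this.buffer = buffer; }
    public int size() { return buffer.remaining(); }
    public byte byteAt(int i) { return buffer.get(buffer.position() + i); }
}

public class CellNameSketch {
    // Comparison must work across backings, which is where the ugliness
    // (and the virtual dispatch discussed above) creeps in.
    static boolean equalNames(CellName a, CellName b) {
        if (a.size() != b.size()) return false;
        for (int i = 0; i < a.size(); i++)
            if (a.byteAt(i) != b.byteAt(i)) return false;
        return true;
    }

    public static void main(String[] args) {
        byte[] raw = "clustering".getBytes();
        ByteBuffer direct = ByteBuffer.allocateDirect(raw.length);
        direct.put(raw).flip();
        System.out.println(equalNames(new HeapCellName(raw), new NativeCellName(direct)));
    }
}
```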

This message was sent by Atlassian JIRA
