cassandra-commits mailing list archives

From "Pavel Yaskevich (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-6694) Slightly More Off-Heap Memtables
Date Thu, 17 Apr 2014 01:47:16 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13972205#comment-13972205
] 

Pavel Yaskevich edited comment on CASSANDRA-6694 at 4/17/14 1:45 AM:
---------------------------------------------------------------------

Here is the [branch|https://github.com/xedin/cassandra/compare/CASSANDRA-6694] that implements
my idea of how to get rid of the Impl classes for Cell (it also adds an optimized updateDigest for
both Cell implementations and a couple of other things). I left DecoratedKey alone for now.
The work is not fully complete yet, but only a couple of nits are missing: I need to change a couple
of places to use CFMetaData and to clone native cells, and I decided not to do that unless we are
going to go with this code.

Regarding [~benedict]'s reorg branch, I found a couple of problems:

# internalGetLong(long, long) in AbstractMemory is actually meant to be internalSetLong(long, long).
# CounterUpdateCell should be BufferCounterUpdateCell, as it extends BufferCell.
# The CounterUpdateCell interface is missing, as well as a NativeCounterUpdateCell implementation
to match it.
# In e.g. NativeExpiringCell there is no need to declare that it implements CellName, as NativeCell
already does.
# Impl classes extend other Impl classes, which doesn't make much sense as all of the methods
are static.
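
Points 4 and 5 can be illustrated with a minimal Java sketch. The types below are simplified
stand-ins that reuse names from the list; they are not the classes from either branch. The
helper class names and all method bodies are hypothetical.

```java
// Simplified stand-ins; the real types have many more members.
interface CellName {
    int dataSize();
}

// NativeCell already implements CellName ...
abstract class NativeCell implements CellName {
    @Override
    public int dataSize() { return 0; } // placeholder body
}

// ... so a subclass is a CellName without repeating "implements CellName".
class NativeExpiringCell extends NativeCell {
}

// Static methods are not dispatched virtually, so one static-only helper
// extending another gains no polymorphism. Both names are hypothetical.
class CellHelper {
    static String describe() { return "cell"; }
}

class ExpiringCellHelper extends CellHelper {
    // This hides CellHelper.describe(); it does not override it.
    static String describe() { return "expiring cell"; }
}

class InheritanceSketch {
    public static void main(String[] args) {
        CellName name = new NativeExpiringCell(); // compiles: the interface is inherited
        System.out.println(name.dataSize());
        System.out.println(CellHelper.describe());
        System.out.println(ExpiringCellHelper.describe());
    }
}
```

Because the call target of a static method is fixed by the class named at the call site, the
extends relationship between the two helpers changes nothing about which method runs.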

bq. Why do you say "no real reason"? This is the serialization format, so we have to convert
to it. That's the definition of what toByteBuffer() should return. We only call it when writing
to disk or to the network, and is no different from the original implementation in that regard.
That's not to say with time we cannot change this, but there's not much we can do yet.

Taken out of context like that it doesn't really make sense, but what I meant is that there are
situations where we don't really need to get a BB from the CellName and can transfer the bytes
directly (especially for the native cell implementations).
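
As a rough sketch of the difference, assuming a hypothetical natively backed name class (not
one from either branch), the code below contrasts a toByteBuffer() conversion with writing the
bytes straight to an output. A direct ByteBuffer stands in for the off-heap region, and writeTo
and streamed are illustrative names only.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Hypothetical stand-in for a natively backed name.
class DirectTransferSketch {
    // A direct buffer plays the role of the off-heap region here.
    private final ByteBuffer memory;
    private final int length;

    DirectTransferSketch(byte[] bytes) {
        memory = ByteBuffer.allocateDirect(bytes.length);
        memory.put(bytes);
        length = bytes.length;
    }

    // Conversion path: materializes a new heap ByteBuffer holding a copy.
    ByteBuffer toByteBuffer() {
        byte[] copy = new byte[length];
        for (int i = 0; i < length; i++)
            copy[i] = memory.get(i); // absolute read, independent of position
        return ByteBuffer.wrap(copy);
    }

    // Direct path: streams each byte to the sink, no intermediate buffer.
    void writeTo(DataOutputStream out) throws IOException {
        for (int i = 0; i < length; i++)
            out.writeByte(memory.get(i));
    }

    // Demo helper: returns the bytes that writeTo produced for the input.
    static byte[] streamed(byte[] input) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            new DirectTransferSketch(input).writeTo(new DataOutputStream(sink));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] input = {1, 2, 3};
        ByteBuffer viaBuffer = new DirectTransferSketch(input).toByteBuffer();
        // Both paths yield the same bytes; only the allocation pattern differs.
        System.out.println(ByteBuffer.wrap(streamed(input)).equals(viaBuffer));
    }
}
```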

bq. I construct it using unsafe, which skips all constructors. So there is no synchronization
or PhantomReference creation.

Right, we should be good there, my bad.
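
For reference, the general mechanism of constructor-skipping allocation can be sketched as
follows. This is only an illustration with a hypothetical Tracked class, not the code from the
reorg branch, and it assumes a JVM on which sun.misc.Unsafe (an unsupported API) is reachable
through reflection.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

class AllocateInstanceSketch {
    static int constructorCalls = 0;

    // Hypothetical class whose constructor has an observable side effect.
    static class Tracked {
        int value = 42; // field initializers run as part of the constructor
        Tracked() { constructorCalls++; }
    }

    // Obtains the Unsafe singleton reflectively (unsupported API).
    static Unsafe unsafe() throws ReflectiveOperationException {
        Field field = Unsafe.class.getDeclaredField("theUnsafe");
        field.setAccessible(true);
        return (Unsafe) field.get(null);
    }

    // Allocates a Tracked instance without running any constructor.
    static Tracked allocateWithoutConstructor() {
        try {
            return (Tracked) unsafe().allocateInstance(Tracked.class);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Tracked raw = allocateWithoutConstructor();
        // The constructor never ran: the counter is untouched and the field
        // still holds its default value instead of 42.
        System.out.println(constructorCalls + " " + raw.value);
    }
}
```

Any side effect that lives in a constructor or field initializer is skipped for an instance
created this way.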


was (Author: xedin):
Here is the [branch|https://github.com/xedin/cassandra/compare/CASSANDRA-6694] that implements
my idea of how to get rid of the Impl classes for Cell (it also adds an optimized updateDigest for
both Cell implementations and a couple of other things). I left DecoratedKey alone for now.
The work is not fully complete yet, but only a couple of nits are missing: I need to change a couple
of places to use CFMetaData and to clone native cells, and I decided not to do that unless we are
going to go with this code.

Regarding [~benedict]'s reorg branch, I found a couple of problems:

# internalGetLong(long, long) in AbstractMemory is actually meant to be internalSetLong(long, long).
# CounterUpdateCell should be BufferCounterUpdateCell, as it extends BufferCell.
# The CounterUpdateCell interface is missing, as well as a NativeCounterUpdateCell implementation
to match it.

bq. Why do you say "no real reason"? This is the serialization format, so we have to convert
to it. That's the definition of what toByteBuffer() should return. We only call it when writing
to disk or to the network, and is no different from the original implementation in that regard.
That's not to say with time we cannot change this, but there's not much we can do yet.

Taken out of context like that it doesn't really make sense, but what I meant is that there are
situations where we don't really need to get a BB from the CellName and can transfer the bytes
directly (especially for the native cell implementations).

bq. I construct it using unsafe, which skips all constructors. So there is no synchronization
or PhantomReference creation.

Right, we should be good there, my bad.

> Slightly More Off-Heap Memtables
> --------------------------------
>
>                 Key: CASSANDRA-6694
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>              Labels: performance
>             Fix For: 2.1 beta2
>
>
> The Off Heap memtables introduced in CASSANDRA-6689 don't go far enough, as the on-heap
overhead is still very large. It should not be tremendously difficult to extend these changes
so that we allocate entire Cells off-heap, instead of multiple BBs per Cell (with all their
associated overhead).
> The goal (if possible) is to reach an overhead of 16-bytes per Cell (plus 4-6 bytes per
cell on average for the btree overhead, for a total overhead of around 20-22 bytes). This
translates to 8-byte object overhead, 4-byte address (we will do alignment tricks like the
VM to allow us to address a reasonably large memory space, although this trick is unlikely
to last us forever, at which point we will have to bite the bullet and accept a 24-byte per
cell overhead), and 4-byte object reference for maintaining our internal list of allocations,
which is unfortunately necessary since we cannot safely (and cheaply) walk the object graph
we allocate otherwise, which is necessary for (allocation-) compaction and pointer rewriting.
> The ugliest thing here is going to be implementing the various CellName instances so
that they may be backed by native memory OR heap memory.
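
The overhead arithmetic and the alignment trick from the description above can be sketched
as follows. The 8-byte alignment (a shift of 3) is an assumption modelled on how the VM
compresses object references; the description does not state the alignment, and all names
here are hypothetical.

```java
class OverheadSketch {
    static final int OBJECT_HEADER_BYTES = 8;  // per-object overhead
    static final int ADDRESS_BYTES = 4;        // compressed native address
    static final int ALLOCATION_REF_BYTES = 4; // reference in the allocation list
    static final int ALIGNMENT_SHIFT = 3;      // assumption: 8-byte alignment

    // 8 + 4 + 4 = 16 bytes per Cell, before the 4-6 bytes of btree overhead.
    static int perCellOverhead() {
        return OBJECT_HEADER_BYTES + ADDRESS_BYTES + ALLOCATION_REF_BYTES;
    }

    // Packs an aligned offset into 32 bits by dropping the always-zero low bits.
    static int compress(long offset) {
        if ((offset & ((1L << ALIGNMENT_SHIFT) - 1)) != 0)
            throw new IllegalArgumentException("offset is not aligned");
        if ((offset >>> ALIGNMENT_SHIFT) > 0xFFFFFFFFL)
            throw new IllegalArgumentException("offset is out of range");
        return (int) (offset >>> ALIGNMENT_SHIFT);
    }

    // Restores the full offset, treating the int as unsigned.
    static long decompress(int compressed) {
        return (compressed & 0xFFFFFFFFL) << ALIGNMENT_SHIFT;
    }

    // Bytes reachable through a 32-bit compressed offset.
    static long addressableBytes() {
        return 1L << (32 + ALIGNMENT_SHIFT);
    }

    public static void main(String[] args) {
        long offset = 1L << 34; // 16 GiB, beyond a plain 32-bit offset
        System.out.println(perCellOverhead());
        System.out.println(decompress(compress(offset)) == offset);
        System.out.println(addressableBytes() >> 30); // size in GiB
    }
}
```

Under that assumption a 4-byte address covers 32 GiB; once more space is needed the address
has to grow to 8 bytes, which, with 8-byte object alignment, gives the 24-byte per-cell figure
mentioned in the description.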



--
This message was sent by Atlassian JIRA
(v6.2#6252)
