cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-7282) Faster Memtable map
Date Thu, 05 Jun 2014 20:00:02 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Benedict updated CASSANDRA-7282:
--------------------------------

    Description: 
Currently we maintain a ConcurrentSkipLastMap of DecoratedKey -> Partition in our memtables.
Maintaining this is an O(lg(n)) operation; since the vast majority of users use a hash partitioner,
it occurs to me we could maintain a hybrid ordered list / hash map. The list would impose
the normal order on the collection, but a hash index would live alongside as part of the same
data structure, simply mapping into the list and permitting O(1) lookups and inserts.

I've chosen to implement this initial version as a linked-list node per item, but we can optimise
this in future by storing fatter nodes that permit a cache-line's worth of hashes to be checked
at once,  further reducing the constant factor costs for lookups.

  was:
Currently we maintain a ConcurrentSkipLastMap of DecoratedKey -> Partition in our memtables.
Maintaining this is an O(lg(n)) operation; since the vast majority of users use a hash partitioner,
it occurs to me we could maintain a hybrid skip list / hash map. The skip list would impose
the order on the collection, but a hash index would live alongside as part of the same data
structure, simply mapping into the skip list and permitting O(1) lookups. It should be possible
to define the hash map to also permit O(1) inserts. Our decorated keys are in fact perfectly
designed for this scheme.

At the same time, we can potentially improve the data locality in the skip list by baking
the initial 64 token bits directly into the structure, and storing multiple values per skip
list entry to improve cache performance, bringing down memory and constant factor cpu overheads.


> Faster Memtable map
> -------------------
>
>                 Key: CASSANDRA-7282
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7282
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>             Fix For: 3.0
>
>
> Currently we maintain a ConcurrentSkipLastMap of DecoratedKey -> Partition in our
memtables. Maintaining this is an O(lg(n)) operation; since the vast majority of users use
a hash partitioner, it occurs to me we could maintain a hybrid ordered list / hash map. The
list would impose the normal order on the collection, but a hash index would live alongside
as part of the same data structure, simply mapping into the list and permitting O(1) lookups
and inserts.
> I've chosen to implement this initial version as a linked-list node per item, but we
can optimise this in future by storing fatter nodes that permit a cache-line's worth of hashes
to be checked at once,  further reducing the constant factor costs for lookups.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Mime
View raw message