cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Branimir Lambov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-7282) Faster Memtable map
Date Sat, 13 Sep 2014 19:44:34 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14132913#comment-14132913
] 

Branimir Lambov commented on CASSANDRA-7282:
--------------------------------------------

This makes sense. Thanks for clarifying, the fact that the list is ordered by the Token escaped
me. Since this is the case, should we restrict this method to Murmur tokens only? For anything
that sorts naturally we could just grab the first four bytes. I wouldn't even call this a
hashCode to avoid the confusion, perhaps an "ordering key" or "ordering prefix"?

I would still change the condition in the preface to use non-strict inequality, though, because
cropping tokens to 32 bits will introduce collisions.

> Faster Memtable map
> -------------------
>
>                 Key: CASSANDRA-7282
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7282
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>              Labels: performance
>             Fix For: 3.0
>
>         Attachments: profile.yaml, reads.svg, run1.svg, writes.svg
>
>
> Currently we maintain a ConcurrentSkipLastMap of DecoratedKey -> Partition in our
memtables. Maintaining this is an O(lg(n)) operation; since the vast majority of users use
a hash partitioner, it occurs to me we could maintain a hybrid ordered list / hash map. The
list would impose the normal order on the collection, but a hash index would live alongside
as part of the same data structure, simply mapping into the list and permitting O(1) lookups
and inserts.
> I've chosen to implement this initial version as a linked-list node per item, but we
can optimise this in future by storing fatter nodes that permit a cache-line's worth of hashes
to be checked at once,  further reducing the constant factor costs for lookups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message