cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-7282) Faster Memtable map
Date Wed, 17 Sep 2014 12:52:34 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136908#comment-14136908 ]

Benedict edited comment on CASSANDRA-7282 at 9/17/14 12:52 PM:
---------------------------------------------------------------

Some more numbers, this time with a warmup dataset that populates the map first, so that
variability due to throughput rate is reduced. These numbers show NBHOM consistently around
3x or more faster than CSLM, although the large warmup introduces per-run variability due to GC.

{noformat}
Benchmark                            (readWriteRatio)  (type)  (warmup)   Mode  Samples     Score  Score error   Units
b.b.c.HashOrderedCollections.test                 0.9    CSLM   5000000  thrpt        5  1392.273     2918.717  ops/ms
b.b.c.HashOrderedCollections.test                 0.9   NBHOM   5000000  thrpt        5  5088.408     8964.885  ops/ms
b.b.c.HashOrderedCollections.test                 0.5    CSLM   5000000  thrpt        5  1128.637     2589.679  ops/ms
b.b.c.HashOrderedCollections.test                 0.5   NBHOM   5000000  thrpt        5  3406.299     5606.085  ops/ms
b.b.c.HashOrderedCollections.test                 0.1    CSLM   5000000  thrpt        5   924.642     1802.045  ops/ms
b.b.c.HashOrderedCollections.test                 0.1   NBHOM   5000000  thrpt        5  3311.107      999.896  ops/ms
b.b.c.HashOrderedCollections.test                   0    CSLM   5000000  thrpt        5   939.757     1776.641  ops/ms
b.b.c.HashOrderedCollections.test                   0   NBHOM   5000000  thrpt        5  2781.503     4723.844  ops/ms
{noformat}

edit: same principle but fewer items warmed up, so less variability due to GC:

{noformat}
Benchmark                            (readWriteRatio)  (type)  (warmup)   Mode  Samples     Score  Score error   Units
b.b.c.HashOrderedCollections.test                 0.9    CSLM   1000000  thrpt       10  2283.934      157.719  ops/ms
b.b.c.HashOrderedCollections.test                 0.9   NBHOM   1000000  thrpt       10  8850.066      147.894  ops/ms
b.b.c.HashOrderedCollections.test                 0.5    CSLM   1000000  thrpt       10  1960.077      145.752  ops/ms
b.b.c.HashOrderedCollections.test                 0.5   NBHOM   1000000  thrpt       10  5637.813      688.394  ops/ms
b.b.c.HashOrderedCollections.test                 0.1    CSLM   1000000  thrpt       10   706.284      162.845  ops/ms
b.b.c.HashOrderedCollections.test                 0.1   NBHOM   1000000  thrpt       10  3270.920     1545.698  ops/ms
b.b.c.HashOrderedCollections.test                   0    CSLM   1000000  thrpt       10  1689.157      176.412  ops/ms
b.b.c.HashOrderedCollections.test                   0   NBHOM   1000000  thrpt       10  2737.195     1042.289  ops/ms
{noformat}
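
For readers who want the NBHOM/CSLM speedup per row rather than raw scores, the calculation can be sketched as below. This is only an illustration: the class and method names are hypothetical, the scores are copied from the 1,000,000-item warmup run above, and the ratios are point estimates that ignore the reported score error.

```java
/** Hypothetical helper: NBHOM/CSLM throughput ratios for the 1,000,000-item warmup run. */
final class SpeedupRatios {
    // One entry per readWriteRatio row, in the order the rows appear in the table.
    static final double[] READ_WRITE_RATIO = {0.9, 0.5, 0.1, 0.0};
    static final double[] CSLM_OPS_PER_MS  = {2283.934, 1960.077, 706.284, 1689.157};
    static final double[] NBHOM_OPS_PER_MS = {8850.066, 5637.813, 3270.920, 2737.195};

    /** Returns NBHOM score divided by CSLM score for each row. */
    static double[] ratios() {
        double[] out = new double[CSLM_OPS_PER_MS.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = NBHOM_OPS_PER_MS[i] / CSLM_OPS_PER_MS[i];
        }
        return out;
    }

    public static void main(String[] args) {
        double[] r = ratios();
        for (int i = 0; i < r.length; i++) {
            System.out.printf("readWriteRatio=%.1f  NBHOM/CSLM=%.2fx%n", READ_WRITE_RATIO[i], r[i]);
        }
    }
}
```

Since the score error is large on some rows, the ratios are best read as rough indications rather than precise figures.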




> Faster Memtable map
> -------------------
>
>                 Key: CASSANDRA-7282
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7282
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>              Labels: performance
>             Fix For: 3.0
>
>         Attachments: profile.yaml, reads.svg, run1.svg, writes.svg
>
>
> Currently we maintain a ConcurrentSkipListMap of DecoratedKey -> Partition in our memtables.
> Maintaining this is an O(lg(n)) operation; since the vast majority of users use a hash
> partitioner, it occurs to me we could maintain a hybrid ordered list / hash map. The list
> would impose the normal order on the collection, but a hash index would live alongside as
> part of the same data structure, simply mapping into the list and permitting O(1) lookups
> and inserts.
> I've chosen to implement this initial version as a linked-list node per item, but we can
> optimise this in future by storing fatter nodes that permit a cache-line's worth of hashes
> to be checked at once, further reducing the constant factor costs for lookups.
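
The hybrid structure described in the issue can be illustrated with a minimal sketch. This is not the NBHOM implementation from the patch: it is single-threaded, uses a fixed-size index with one sentinel node per bucket (an assumption made here to keep inserts simple), and treats a 32-bit token as the whole key, so there is no collision handling. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Single-threaded sketch: a linked list sorted by token, plus a hash index into it. */
final class HashOrderedSketch {
    private static final class Node {
        final int token;        // hash of the key; unsigned order defines list order
        final boolean sentinel; // true for the per-bucket marker nodes
        String value;
        Node next;
        Node(int token, boolean sentinel, String value) {
            this.token = token; this.sentinel = sentinel; this.value = value;
        }
    }

    private final int shift;    // top (32 - shift) bits of the token select a bucket
    private final Node[] index; // index[b] is the sentinel that starts bucket b

    HashOrderedSketch(int indexBits) {
        if (indexBits < 1 || indexBits > 16) throw new IllegalArgumentException("indexBits must be 1..16");
        shift = 32 - indexBits;
        index = new Node[1 << indexBits];
        Node prev = null;
        for (int b = 0; b < index.length; b++) { // link all sentinels in bucket order
            index[b] = new Node(b << shift, true, null);
            if (prev != null) prev.next = index[b];
            prev = index[b];
        }
    }

    /** Jumps to the token's bucket, then walks to the last node ordered before the token. */
    private Node predecessor(int token) {
        Node cur = index[token >>> shift];
        while (cur.next != null && !cur.next.sentinel
                && Integer.compareUnsigned(cur.next.token, token) < 0) {
            cur = cur.next;
        }
        return cur;
    }

    void put(int token, String value) {
        Node pred = predecessor(token);
        Node succ = pred.next;
        if (succ != null && !succ.sentinel && succ.token == token) {
            succ.value = value; // token already present: overwrite
            return;
        }
        Node n = new Node(token, false, value);
        n.next = succ;          // splice in, preserving token order
        pred.next = n;
    }

    String get(int token) {
        Node succ = predecessor(token).next;
        return succ != null && !succ.sentinel && succ.token == token ? succ.value : null;
    }

    /** Ordered iteration is a plain list walk that skips the sentinels. */
    List<Integer> tokensInOrder() {
        List<Integer> out = new ArrayList<>();
        for (Node n = index[0]; n != null; n = n.next) {
            if (!n.sentinel) out.add(n.token);
        }
        return out;
    }
}
```

With uniformly distributed tokens, as a hash partitioner would be expected to produce, each bucket holds only a few nodes when the index is sized in proportion to the item count, which is where the expected O(1) lookup and insert cost comes from. A real implementation would also need to resize the index and handle concurrent writers, both of which this sketch omits.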



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
