cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-7546) AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory
Date Tue, 16 Sep 2014 05:50:34 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14135004#comment-14135004 ]

Benedict commented on CASSANDRA-7546:
-------------------------------------

1: that's great news :)
3: if you want lots of unique clustering key values per partition, stress currently has some
limitations: you will need/want multiple clustering columns for it to be able to generate
them smoothly without taking donkey's years per insert (on the workload-generation side).
Its minimum unit of generation (not insert) is a single tier of clustering values, so with
your spec it would generate all 100B values every time you wanted to insert any number of rows.
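
To put numbers on that (the single-column figures are illustrative, assuming a spec with one
clustering column covering the whole domain):

{noformat}
# minimum unit of generation = one whole tier of clustering values
#
# single clustering column:
#   cluster: fixed(100B)    -> one tier of 100,000,000,000 values,
#                              regenerated in full for any insert
#
# three clustering columns (the yaml below):
#   cluster: fixed(100) x 3 -> tiers of just 100 values each,
#                              yet 100 * 100 * 100 = 1M rows per partition
{noformat}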

So, you want to consider a yaml like this:

{noformat}
table_definition: |
  CREATE TABLE testtable (
        p text,
        c1 int, c2 int, c3 int,
        v blob,
        PRIMARY KEY(p, c1, c2, c3)
  ) WITH COMPACT STORAGE 
    AND compaction = { 'class':'LeveledCompactionStrategy' }
    AND comment='TestTable'

columnspec:
  - name: p
    size: fixed(16)
  - name: c1
    cluster: fixed(100)
  - name: c2
    cluster: fixed(100)
  - name: c3
    cluster: fixed(100)
  - name: v
    size: gaussian(50..250)
{noformat}

Then you want to consider passing -pop seq=1..1M -insert visits=fixed(1M) revisits=uniform(1..1024)

The visits parameter here tells stress to split each partition into 1M distinct inserts; since
the yaml above deterministically yields 1M rows per partition (100 * 100 * 100), that works out
to exactly one row inserted per visit. The revisits distribution defines how many partition keys
we operate over at once: we keep working within that set, and only once a partition is exhausted
do we select another to include in the working set.
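
Putting the yaml and flags together, an invocation might look roughly like this (the profile
path and op count are placeholders, not from the original message):

{noformat}
cassandra-stress user profile=./stress.yaml ops(insert=1) n=1000000 \
    -pop seq=1..1M \
    -insert visits=fixed(1M) revisits=uniform(1..1024)
{noformat}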

Notice I've removed the population spec from your partition key in the yaml. It isn't necessary
to constrain it there, since you can constrain the _seed_ population with the -pop parameter,
which is the better approach here (it lets you reuse the same yaml across runs). In this case,
though, given our revisits() distribution we can also leave the seed population unconstrained:
once the first 1024 partition keys have been generated, no other PK will be visited until one
of those has been fully exhausted (i.e. 1024 * 1M inserts, quite a few...).
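
For contrast, the two places the constraint could live (the population line is a guess at what
the removed spec looked like):

{noformat}
# in the yaml: ties the profile to one run configuration
columnspec:
  - name: p
    size: fixed(16)
    population: seq(1..1M)

# on the command line: same constraint, yaml stays reusable
-pop seq=1..1M
{noformat}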

You may also constrain the seed to the same range, in which case an exhausted key is always
refilled straight back into the working set. The distribution you choose doesn't matter here:
stress keeps generating values until one turns up that isn't already present in the stash, and
if both operate over the same domain that can only ever yield one candidate regardless of
distribution. So I suggest a sequential distribution, to ensure determinism.
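
Concretely, that would be something like the following, where 1..1024 matches the revisits
bound above (my reading of "the same range"):

{noformat}
-pop seq=1..1024 -insert visits=fixed(1M) revisits=uniform(1..1024)
{noformat}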

> AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7546
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7546
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: graham sanderson
>            Assignee: graham sanderson
>             Fix For: 2.1.1
>
>         Attachments: 7546.20.txt, 7546.20_2.txt, 7546.20_3.txt, 7546.20_4.txt, 7546.20_5.txt,
> 7546.20_6.txt, 7546.20_7.txt, 7546.20_7b.txt, 7546.20_alt.txt, 7546.20_async.txt, 7546.21_v1.txt,
> hint_spikes.png, suggestion1.txt, suggestion1_21.txt, young_gen_gc.png
>
>
> In order to preserve atomicity, this code attempts to read, clone/update, then CAS the
> state of the partition.
> Under heavy contention for updating a single partition this can cause some fairly staggering
> memory growth (the more cores on your machine, the worse it gets).
> Whilst many usage patterns don't do highly concurrent updates to the same partition,
> hinting today does, and in this case wild (order(s) of magnitude more than expected) memory
> allocation rates can be seen (especially when the updates being hinted are small updates to
> different partitions, which can happen very fast on their own) - see CASSANDRA-7545
> It would be best to eliminate/reduce/limit the spinning memory allocation whilst not
> slowing down the very common uncontended case.
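
The pattern the description refers to looks roughly like this (a minimal sketch, not
Cassandra's actual AtomicSortedColumns code; the Holder payload and the copy/append step are
hypothetical stand-ins for the cloned partition state):

{noformat}
import java.util.concurrent.atomic.AtomicReference;

// Sketch of a read -> clone/update -> CAS spin loop. Every failed CAS
// discards a freshly allocated copy, so the allocation rate scales with
// contention on the same reference.
final class CasSpinExample {
    static final class Holder {
        final int[] payload;
        Holder(int[] payload) { this.payload = payload; }
    }

    private final AtomicReference<Holder> ref =
            new AtomicReference<>(new Holder(new int[0]));

    void add(int value) {
        while (true) {
            Holder current = ref.get();
            // clone/update: a fresh copy is allocated on EVERY attempt
            int[] copy = new int[current.payload.length + 1];
            System.arraycopy(current.payload, 0, copy, 0,
                             current.payload.length);
            copy[copy.length - 1] = value;
            if (ref.compareAndSet(current, new Holder(copy)))
                return; // success: this attempt's allocation survives
            // failure: the copy above is instant garbage; spin and retry
        }
    }
}
{noformat}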



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
