cassandra-commits mailing list archives

From "Jeff Jirsa (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-13038) 33% of compaction time spent in StreamingHistogram.update()
Date Wed, 08 Feb 2017 17:07:41 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-13038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858245#comment-15858245 ]

Jeff Jirsa commented on CASSANDRA-13038:
----------------------------------------

[~slebresne] - my main objection is that I know people (not me) who are using 5-minute TWCS
windows, and presumably they're doing so because they need to expire data in 5-minute chunks.
Rounding the TTL histogram buckets up to 1 hour would eliminate that use case, and it's not a
hypothetical one. There is a lower bound below which it becomes much less useful (maybe it's
5 minutes, or 1 minute), but it's not 1 hour.

In any case, I'll take the ticket. I had implemented one approach using a larger temporary
spool of bins that gets merged/compacted down into the bounded set on use, and it's about
3-5x faster without further loss of resolution. I'll extend that to add 60-second rounding
and see what that does - I suspect the gain will be significant, especially in the cases
where we're dealing with a day's (or week's, or month's) worth of data inserted with a
default TTL. Depending on how that looks, we can decide whether 60s or 300s is the right
rounding point.

> 33% of compaction time spent in StreamingHistogram.update()
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-13038
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13038
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction
>            Reporter: Corentin Chary
>            Assignee: Jeff Jirsa
>         Attachments: compaction-speedup.patch, compaction-streaminghistrogram.png, profiler-snapshot.nps
>
>
> With the following table, that contains a *lot* of cells: 
> {code}
> CREATE TABLE biggraphite.datapoints_11520p_60s (
>     metric uuid,
>     time_start_ms bigint,
>     offset smallint,
>     count int,
>     value double,
>     PRIMARY KEY ((metric, time_start_ms), offset)
> ) WITH CLUSTERING ORDER BY (offset DESC)
>     AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
>                       'compaction_window_size': '6', 'compaction_window_unit': 'HOURS',
>                       'max_threshold': '32', 'min_threshold': '6'};
>
> Keyspace : biggraphite
>         Read Count: 1822
>         Read Latency: 1.8870054884742042 ms.
>         Write Count: 2212271647
>         Write Latency: 0.027705127678653473 ms.
>         Pending Flushes: 0
>                 Table: datapoints_11520p_60s
>                 SSTable count: 47
>                 Space used (live): 300417555945
>                 Space used (total): 303147395017
>                 Space used by snapshots (total): 0
>                 Off heap memory used (total): 207453042
>                 SSTable Compression Ratio: 0.4955200053039823
>                 Number of keys (estimate): 16343723
>                 Memtable cell count: 220576
>                 Memtable data size: 17115128
>                 Memtable off heap memory used: 0
>                 Memtable switch count: 2872
>                 Local read count: 0
>                 Local read latency: NaN ms
>                 Local write count: 1103167888
>                 Local write latency: 0.025 ms
>                 Pending flushes: 0
>                 Percent repaired: 0.0
>                 Bloom filter false positives: 0
>                 Bloom filter false ratio: 0.00000
>                 Bloom filter space used: 105118296
>                 Bloom filter off heap memory used: 106547192
>                 Index summary off heap memory used: 27730962
>                 Compression metadata off heap memory used: 73174888
>                 Compacted partition minimum bytes: 61
>                 Compacted partition maximum bytes: 51012
>                 Compacted partition mean bytes: 7899
>                 Average live cells per slice (last five minutes): NaN
>                 Maximum live cells per slice (last five minutes): 0
>                 Average tombstones per slice (last five minutes): NaN
>                 Maximum tombstones per slice (last five minutes): 0
>                 Dropped Mutations: 0
> {code}
> It looks like a good chunk of the compaction time is lost in StreamingHistogram.update()
> (which is used to store the estimated tombstone drop times).
> This could be caused by a huge number of distinct deletion times, which would make the
> bins huge, but this histogram should be capped at 100 keys. It's more likely caused by the
> huge number of cells.
> A simple solution could be to only take into account a fraction of the cells; the fact that
> this table uses TWCS also gives us an additional hint that sampling deletion times would
> be fine.
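
The sampling idea from the description above could look roughly like the sketch below. It is
purely hypothetical: the class name, the 1-in-N rate, and the LongConsumer sink are not from
any patch on this ticket.

{code}
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.LongConsumer;

/**
 * Hypothetical sketch of the sampling idea from the description: forward
 * only 1 in N tombstone drop times to the underlying histogram.
 */
public class SamplingUpdater
{
    private final int sampleOneIn;   // e.g. 16 - an assumed, tunable rate
    private final LongConsumer sink; // e.g. histogram::update

    public SamplingUpdater(int sampleOneIn, LongConsumer sink)
    {
        this.sampleOneIn = sampleOneIn;
        this.sink = sink;
    }

    public void update(long tombstoneDropTime)
    {
        // With TWCS, drop times within one SSTable cluster tightly, so a
        // uniform sample should still approximate the distribution well.
        if (ThreadLocalRandom.current().nextInt(sampleOneIn) == 0)
            sink.accept(tombstoneDropTime);
    }
}
{code}

For example, new SamplingUpdater(16, histogram::update) would feed roughly one in sixteen
cells' drop times into the existing histogram.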



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
