cassandra-commits mailing list archives

From "Pavel Yaskevich (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-4937) CRAR improvements (object cache + CompressionMetadata chunk offset storage moved off-heap).
Date Fri, 09 Nov 2012 19:00:14 GMT


Pavel Yaskevich commented on CASSANDRA-4937:

bq. How much memory do you see CM using, prior to this change?
bq. I'm not sure how much CRAR "objectCache" can help if we trash the queue on reference release. This would be more useful if we (1) made it per-SSTR instead of global and (2) allowed CRAR to be used by more than one request. Either way, let's split this out to a separate patch – think this should be 1.2-only.

It depends on the data size per-SSTable, but the overhead is about 20MB per 1GB of data, which would
definitely be promoted to old gen and kept there. That is not the biggest problem, though: the problem
with the compression allocation rate is that we are opening one CRAR per row read (CompressedSegmentedFile.getSegment),
which allocates 128KB (two 64KB compression/decompression buffers) plus the additional small
memory overhead of the checksum buffer and fields. That is what "objectCache" is here to solve, because
it was figured that even when up to 12 SSTables are involved per read, we wouldn't have more
than "concurrent_reads" CRARs per-SSTable in the cache.
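
The recycling idea above can be sketched roughly as follows. This is an illustrative sketch only, not the actual patch: the class and field names (RecyclingReaderCache, Reader) are hypothetical, and the real CRAR object carries more state than two buffers. The point is that a per-file queue lets concurrent reads reuse the 128KB of buffers instead of allocating them on every getSegment call, with the pool naturally bounded by how many readers can be checked out at once.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of the "objectCache" idea: recycle reader objects
// (with their 64KB compression/decompression buffers) rather than
// allocating fresh ones on every per-row read. Names are illustrative.
final class RecyclingReaderCache
{
    static final int BUFFER_SIZE = 64 * 1024; // 64KB each, 128KB per reader

    static final class Reader
    {
        final byte[] compressed = new byte[BUFFER_SIZE];
        final byte[] uncompressed = new byte[BUFFER_SIZE];
    }

    // one queue per data file; its size is bounded in practice by
    // concurrent_reads, since only that many readers are out at a time
    private final ConcurrentLinkedQueue<Reader> cache = new ConcurrentLinkedQueue<>();

    Reader acquire()
    {
        Reader r = cache.poll();
        return (r != null) ? r : new Reader(); // allocate only on a miss
    }

    void release(Reader r)
    {
        cache.offer(r); // hand the buffers back for the next read
    }
}
```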

bq. RAR skip cache harms performance in situation when reads are done in parallel with compaction

The problem with the code we have right now is that it doesn't actually skip only the blocks that compaction
read: it drops the *whole* file's page cache after a fixed interval (128MB), so when you have long-running
compactions (throttled, or big data files, for example) normal reads hit the cold
data very frequently. The only thing working correctly right now in terms of skipping cache
is SW. I have done some benchmarks with and without skipping cache, and they show that page replacement
done by the kernel is much better than our skip suggestions via recent read latency histograms.
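
To make the failure mode concrete, here is a deliberately simplified sketch of the behaviour described above (not the actual RAR code; the counter stands in for posix_fadvise(DONTNEED) calls): once the fixed 128MB interval elapses, the drop covers the whole file rather than just the region the compaction already consumed, evicting exactly the pages concurrent reads had warmed.

```java
// Simplified model of the skip-cache bug described above. The real code
// issues posix_fadvise(DONTNEED); here we just count evicted bytes.
final class SkipCacheSketch
{
    static final long INTERVAL = 128L * 1024 * 1024; // fixed 128MB interval

    long bytesSinceLastDrop = 0;
    long totalDropped = 0; // stand-in for pages evicted from the cache

    // what the current behaviour effectively amounts to
    void onRead(long bytes, long fileLength)
    {
        bytesSinceLastDrop += bytes;
        if (bytesSinceLastDrop >= INTERVAL)
        {
            totalDropped += fileLength; // whole file, not just what we read
            bytesSinceLastDrop = 0;
        }
    }
}
```

For a multi-GB SSTable this means every 128MB of compaction progress evicts the entire file, which is why parallel reads keep landing on cold data.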

bq. CM.close is added but I don't see it called anywhere

It should be called on CRAR.deallocate(); sorry, I must have missed that when I was merging.
I will fix that and update the patch.

bq. If we're going to move CompressionMetadata off-heap, why not use a single Memory object
instead of BLA? This should also be 1.2.

Yes, that could be done as a single object, I just didn't want to remove the historical paging
change, especially since the actual overhead of paging is very low :)
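
For what the single-Memory-object variant would look like, here is a hedged sketch. Cassandra's actual Memory class is Unsafe-based; a direct ByteBuffer is used here only to keep the example self-contained, and the class name (OffHeapChunkOffsets) is hypothetical. The idea is simply that all chunk offsets live in one contiguous off-heap allocation instead of a paged long-array structure, so lookup is a single indexed read.

```java
import java.nio.ByteBuffer;

// Sketch of keeping CompressionMetadata chunk offsets in one contiguous
// off-heap allocation. Uses a direct ByteBuffer for self-containment;
// the real implementation would use Cassandra's Unsafe-backed Memory.
final class OffHeapChunkOffsets
{
    private final ByteBuffer memory; // single off-heap block, 8 bytes per chunk

    OffHeapChunkOffsets(long[] offsets)
    {
        memory = ByteBuffer.allocateDirect(offsets.length * Long.BYTES);
        for (long off : offsets)
            memory.putLong(off); // copy offsets off-heap once, at open time
    }

    long chunkOffset(int chunkIndex)
    {
        return memory.getLong(chunkIndex * Long.BYTES); // absolute indexed read
    }
}
```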

> CRAR improvements (object cache + CompressionMetadata chunk offset storage moved off-heap).
> -------------------------------------------------------------------------------------------
>                 Key: CASSANDRA-4937
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>    Affects Versions: 1.1.6
>            Reporter: Pavel Yaskevich
>            Assignee: Pavel Yaskevich
>            Priority: Minor
>             Fix For: 1.1.7
>         Attachments: CASSANDRA-4937.patch
> After a good amount of testing on one of the clusters it was found that in order to improve
> read latency we need to minimize the allocation rate that compression involves; that minimizes
> GC (as well as heap usage) and substantially decreases latency on read-heavy workloads.
> I have also discovered that the RAR skip cache harms performance in situations when reads
> are done in parallel with compaction working with relatively big SSTable files (a few GB and
> more). The attached patch removes the possibility to skip cache for compressed files (I can also
> add changes to RAR to remove the skip cache functionality as a separate patch).

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:
