cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-7031) Increase default commit log total space + segment size
Date Tue, 22 Apr 2014 10:41:16 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976624#comment-13976624 ]

Benedict commented on CASSANDRA-7031:
-------------------------------------

From what POV is 128Mb a long gap between archived segments? Do we mean that there may
be a 128Mb gap after the most recent archive during which no PIT restore is possible? Seems
like this would be a minimal problem, as the most recent CLS is still present in the CL directory,
and we could always offer the ability to create a PITR point through force recycling the current
CL segment at the requested time to make sure there is a separate backup. If you care about
rolling PITR backups with minimal intervals then you're probably a very specific use case,
I'd reckon.

As far as replay is concerned, I don't see a major difference: we need to read ahead potentially
more than even one 128Mb file to check if there are delayed commits, and either way 128Mb
is a very small amount of data - a few seconds at most of extra restore time.
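
As a rough check on the "few seconds at most" figure, here is a minimal sketch of the arithmetic. The replay throughput values are assumptions picked for illustration, not measurements from this ticket, and extra_replay_seconds is a hypothetical helper name.

```python
# Rough estimate of the extra replay time for one additional commit log segment.
# The throughput figures are illustrative assumptions, not measured values.

def extra_replay_seconds(segment_mb: float, replay_mb_per_s: float) -> float:
    """Seconds needed to read one extra segment at the given replay rate."""
    if replay_mb_per_s <= 0:
        raise ValueError("replay throughput must be positive")
    return segment_mb / replay_mb_per_s


if __name__ == "__main__":
    # Assumed replay rates in MB/s; real values depend on disk and mutation mix.
    for throughput in (50.0, 100.0, 200.0):
        seconds = extra_replay_seconds(128, throughput)
        print(f"{throughput:.0f} MB/s -> {seconds:.2f} s for a 128 MB segment")
```

Even at the slowest assumed rate, one extra 128Mb segment adds under three seconds, which is in line with the estimate above.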

> Increase default commit log total space + segment size
> ------------------------------------------------------
>
>                 Key: CASSANDRA-7031
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7031
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Trivial
>             Fix For: 2.1 beta2
>
>         Attachments: 7031.txt
>
>
> I would like to increase the default commit log total space and segment size options
> for 64-bit JVMs:
> The current default of 1Gb and 32Mb is quite constrained and can have some (very minor)
> negative performance implications, for no major benefit:
> # 32Mb files are actually quite small, and if during the 10s interval we have completely
> filled multiple of them (quite easy) it would be more efficient to write fewer larger files,
> as we can issue fewer fsyncs and permit the OS to schedule the writes more efficiently. On
> my box this has a small but noticeable impact. Although I would expect on decent server hardware
> this would be smaller still, since we immediately drop the pages from cache on writing there
> isn't a great deal of advantage to keeping the files so small. The only advantage I can see
> is that during a drop KS/CF or other event that forces log rollover we're wasting less space
> until log recycling. 128-256Mb are modest increases that seem more appropriate to me.
> # 1Gb is too small for the default total log space. We can find that we force memtable
> flushes as a result of log utilisation instead of memtable occupancy quite often (esp. as
> a result of increased effective memtable space from recent improvements), especially on machines
> with more addressable memory. I suggest 8Gb as a minimum. The only disadvantage of having
> more log data is that replay on restart may be slightly slower, but since most of the events
> will be ignored it should be relatively benign, and I would rather take the penalty on startup
> instead of during running, no matter how small the running penalty.
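
To make the sizes in the description concrete, here is a small sketch of the arithmetic behind them. It assumes 1Gb means 1024Mb and uses a hypothetical write rate of 20 MB/s over the 10s interval. Both function names are made up for illustration and are not part of Cassandra.

```python
import math

# Illustrative arithmetic only. The write rate is an assumed figure and the
# helper names are hypothetical; nothing here is read from Cassandra itself.


def segment_count(total_space_mb: int, segment_mb: int) -> int:
    """Whole segments that fit in the configured total commit log space."""
    return total_space_mb // segment_mb


def segments_touched(write_mb_per_s: float, interval_s: float, segment_mb: int) -> int:
    """Segment files written to during one interval at a steady write rate."""
    return math.ceil(write_mb_per_s * interval_s / segment_mb)


if __name__ == "__main__":
    # Current defaults from the description versus the proposed sizes.
    for total_mb, seg_mb in ((1024, 32), (8192, 128), (8192, 256)):
        print(f"{total_mb} MB total / {seg_mb} MB segments = "
              f"{segment_count(total_mb, seg_mb)} segments")

    # Assumed 20 MB/s of commit log writes over a 10 s interval (200 MB).
    for seg_mb in (32, 128, 256):
        print(f"{seg_mb} MB segments: "
              f"{segments_touched(20.0, 10.0, seg_mb)} file(s) per interval")
```

Under these assumptions the 32Mb segment size spreads one interval's writes over 7 files, against 2 files at 128Mb and 1 file at 256Mb, while 8Gb of total space still amounts to only 32 to 64 segments.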



