cassandra-commits mailing list archives

From "Peter Schuller (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-1470) use direct io for compaction
Date Wed, 13 Oct 2010 19:31:34 GMT


Peter Schuller commented on CASSANDRA-1470:

On the commit log direct I/O: the buffering is currently limited to 128k, which I would expect
to negate, to some extent, the benefit of periodic sync mode. People who use periodic sync in a
write-heavy environment probably don't want direct I/O (effectively an fsync() in terms of
performance characteristics) every 128k.
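
To put a rough number on the concern above, here is a back-of-the-envelope sketch. The 50 MiB/s write rate is a hypothetical figure picked only for illustration, and the model assumes that every filled 128k buffer turns into exactly one synchronous direct I/O write:

```python
def forced_writes_per_second(write_rate_bytes, buffer_bytes=128 * 1024):
    """Synchronous writes per second if every full buffer goes straight to disk.

    Assumes one direct I/O write per filled buffer and nothing else.
    """
    return write_rate_bytes / buffer_bytes


# Hypothetical sustained commit log write rate of 50 MiB/s.
rate = 50 * 1024 * 1024
print(forced_writes_per_second(rate))  # 400.0
```

Under these assumptions the disk sees hundreds of fsync-like writes per second, whatever the periodic sync interval is set to.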

It would also be detrimental for row mutations larger than, say, 50k, since the probability
of hitting disk more than once for a single row mutation commit would be high.
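
A simple model of that probability, as a sketch only: assume the mutation starts at a uniformly random offset within the 128k buffer and that the buffer is written out exactly when it fills up. The mutation then needs more than one disk write whenever it straddles a buffer boundary. The function name and the uniform-offset assumption are mine, not taken from the patch.

```python
def straddle_probability(mutation_bytes, buffer_bytes=128 * 1024):
    """Chance that a mutation crosses a buffer boundary (so needs >1 write).

    Assumes a uniformly random start offset in the buffer; a mutation at
    least as large as the buffer always crosses a boundary.
    """
    return min(1.0, mutation_bytes / buffer_bytes)


print(straddle_probability(50 * 1024))   # 0.390625
print(straddle_probability(200 * 1024))  # 1.0
```

So under this model a 50k mutation already hits disk twice in roughly 39% of commits, and the figure grows linearly with mutation size.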

If my understanding is correct, some possible suggestions:

(1) increase the commit log buffer size significantly to mitigate the problem, possibly to the
extent that an entire commit log segment is kept in RAM (this also has a negative effect on
latency once you *do* sync)

(2) only enable direct I/O when in batched mode (not very useful)

(3) actually prefer posix_fadvise() in this case (contrary to the sstable/compaction case).

Other than the extra effort, (3) is probably the cleanest (by my initial feeling) in this
particular case, since commit log usage maps directly to the functionality implemented by
POSIX_FADV_DONTNEED, with none of the drawbacks discussed earlier that applied to the
compaction case.
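
For illustration, a minimal sketch of what option (3) amounts to at the system call level, written in Python rather than against the Cassandra code base. The helper name and the file name are hypothetical. The data is synced first because, on Linux, POSIX_FADV_DONTNEED only discards clean pages, and the advice itself is best-effort.

```python
import os
import tempfile


def append_sync_and_drop(fd, payload, synced_upto):
    """Append payload, sync it, then advise the kernel to drop the synced range.

    Returns the new end-of-file offset, to be passed as synced_upto next time.
    """
    view = memoryview(payload)
    while view:  # os.write may write fewer bytes than requested
        written = os.write(fd, view)
        view = view[written:]
    os.fsync(fd)  # dirty pages are not dropped by DONTNEED, so sync first
    end = os.fstat(fd).st_size
    if hasattr(os, "posix_fadvise"):  # not available on every platform
        try:
            os.posix_fadvise(fd, synced_upto, end - synced_upto,
                             os.POSIX_FADV_DONTNEED)
        except OSError:
            pass  # the call is only advice; ignore filesystems that reject it
    return end


def demo():
    """Append two 4 KiB records to a throwaway file and return both offsets."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "commitlog.bin")  # hypothetical name
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        try:
            first = append_sync_and_drop(fd, b"a" * 4096, 0)
            second = append_sync_and_drop(fd, b"b" * 4096, first)
        finally:
            os.close(fd)
        return first, second
```

Writes still go through the page cache here, so the buffering behaviour of periodic sync mode is unchanged; only the already-synced log pages are released, which fits a file that is written once and normally not read back.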

> use direct io for compaction
> ----------------------------
>                 Key: CASSANDRA-1470
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>             Fix For: 0.6.7, 0.7.1
>         Attachments: 1470-v2.txt, 1470.txt, use.DirectIORandomAccessFile.for.commitlog.against.1022235.patch
> When compaction scans through a group of sstables, it forces out the data in the OS buffer
cache being used for hot reads, which can have a dramatic negative effect on performance.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
