cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ariel Weisberg (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8630) Faster sequential IO (on compaction, streaming, etc)
Date Wed, 19 Aug 2015 19:16:47 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703597#comment-14703597
] 

Ariel Weisberg commented on CASSANDRA-8630:
-------------------------------------------

Yes you should rebase to 3.0. We port changes forward (I learned this recently myself).

* MemoryInputStream.available() can wrap the addition between buffer.remaining() + Ints.saturatedCast(memRemaining()).
Do the addition and then the saturating cast.
* Why does RandomAccessReader accept a builder and a parameter for initializing the buffer?
Seems like we lose the bonus of a builder a builder allowing a constant signature.
* A nit in initializeBuffer, it does firstSegment.value().slice() which implies you want a
subset of the buffer? duplicate() makes it obvious there is no such concern.
* I think there is a place for unit tests stressing the 2 gigabyte boundaries. That means
testing available()/length()/remaining() style methods as well as being able to read and seek
with instances of these things that are larger than 2g. Doing it with the actual file based
ones seems bad, but maybe you could intercept those to work with memory so they run fast or
ingloriously mock their inputs.
* For rate limiting is your current solution to consume buffer size bytes from the limiter
at a time for both mmap reads and standard? And you accomplish this by slicing the buffer
then updating the position? I don't see you setting the limit before slicing?
* I thought NIODataInputStream had methods for reading into ByteBuffers, but was wrong. It's
kind of thorny to add one to RebufferingInputStream so I think you did the right thing putting
it in FileSegmentedInputStream even though it's an odd concern to have in that class. Unless
you have a better idea.

Stefania is your rework of segment handling still in progress? IOW should I hold off until
you are done.

[~benedict] In what scenario would we not want to map the file with as few 2 gigabyte buffers
as possible?

I am still digesting the segments/boundaries/mapping issues.




> Faster sequential IO (on compaction, streaming, etc)
> ----------------------------------------------------
>
>                 Key: CASSANDRA-8630
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8630
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core, Tools
>            Reporter: Oleg Anastasyev
>            Assignee: Stefania
>              Labels: compaction, performance
>             Fix For: 3.x
>
>         Attachments: 8630-FasterSequencialReadsAndWrites.txt, cpu_load.png, flight_recorder_001_files.tar.gz,
flight_recorder_002_files.tar.gz, mmaped_uncomp_hotspot.png
>
>
> When node is doing a lot of sequencial IO (streaming, compacting, etc) a lot of CPU is
lost in calls to RAF's int read() and DataOutputStream's write(int).
> This is because default implementations of readShort,readLong, etc as well as their matching
write* are implemented with numerous calls of byte by byte read and write. 
> This makes a lot of syscalls as well.
> A quick microbench shows than just reimplementation of these methods in either way gives
8x speed increase.
> A patch attached implements RandomAccessReader.read<Type> and SequencialWriter.write<Type>
methods in more efficient way.
> I also eliminated some extra byte copies in CompositeType.split and ColumnNameHelper.maxComponents,
which were on my profiler's hotspot method list during tests.
> A stress tests on my laptop show that this patch makes compaction 25-30% faster  on uncompressed
sstables and 15% faster for compressed ones.
> A deployment to production shows much less CPU load for compaction. 
> (I attached a cpu load graph from one of our production, orange is niced CPU load - i.e.
compaction; yellow is user - i.e. not compaction related tasks)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message