cassandra-commits mailing list archives

From "T Jake Luciani (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-12269) Faster write path
Date Thu, 21 Jul 2016 21:01:20 GMT
T Jake Luciani created CASSANDRA-12269:
------------------------------------------

             Summary: Faster write path
                 Key: CASSANDRA-12269
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12269
             Project: Cassandra
          Issue Type: Improvement
            Reporter: T Jake Luciani
            Assignee: T Jake Luciani
             Fix For: 3.10


The new storage engine (CASSANDRA-8099) has caused a regression in write and read performance.
This ticket is to address these regressions and bring 3.0 as close to 2.2 as possible.  After
much toil I've found four main causes:

1.  The cost of calculating the size of a serialized row is higher now since we no longer
have the cell name and value managed as ByteBuffers as we did pre-3.0.  That means we currently
serialize the row twice, once to calculate the size and once to write the data.  This happened
during the SSTable writes and was addressed there in CASSANDRA-9766.
     Double serialization is also happening in CommitLog and the MessagingService.  We need
to apply the same techniques to these as we did to the SSTable serialization.

2.  Even after fixing (1) there is still more GC pressure and CPU usage in 3.0 because we
encode everything from the {{Column}} to the {{Row}} to the {{Partition}} as a {{BTree}}.
Specifically, the {{BTreeSearchIterator}} is used for all iterator() methods.  Both {{BTree}}
and {{BTreeSearchIterator}} are useful for efficient removal and searching of the trees, but
in the case of SerDe we almost always want to simply walk the entire tree forward or in
reverse and apply a function to each element.  To that end, we can use lambdas and do this
without any extra classes.

3.  We use a lot of thread locals and check them constantly on the read/write paths, for
client warnings, tracing, temp buffers, etc.  We should move all thread locals to FastThreadLocals
and all threads to FastThreadLocalThreads.

4.  The memtable flusher defaults changed in 3.2, which caused a regression; see CASSANDRA-12228.
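
For point (1), the difference between serializing twice and serializing once can be sketched as follows. This is only an illustration of the general idea, not the technique used in CASSANDRA-9766: {{SketchRow}}, {{SinglePassWriter}}, and the length-prefixed layout are hypothetical names and assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical stand-in for a row: a list of string cells. Not Cassandra's Row class.
final class SketchRow {
    final List<String> cells;

    SketchRow(List<String> cells) { this.cells = cells; }

    // Encodes each cell as a 4-byte length followed by its UTF-8 bytes.
    void serialize(DataOutputStream out) throws IOException {
        for (String cell : cells) {
            byte[] bytes = cell.getBytes(StandardCharsets.UTF_8);
            out.writeInt(bytes.length);
            out.write(bytes);
        }
    }
}

final class SinglePassWriter {
    // Two passes: the row is encoded once only to learn its size, then encoded again.
    static byte[] writeTwoPass(SketchRow row) {
        try {
            ByteArrayOutputStream sizing = new ByteArrayOutputStream();
            row.serialize(new DataOutputStream(sizing)); // first encode, result discarded
            ByteArrayOutputStream target = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(target);
            out.writeInt(sizing.size());
            row.serialize(out);                          // second encode
            return target.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // One pass: the row is encoded into a reusable scratch buffer, whose size is
    // then known, and the buffered bytes are copied out after the length prefix.
    static byte[] writeSinglePass(SketchRow row, ByteArrayOutputStream scratch) {
        try {
            scratch.reset();
            row.serialize(new DataOutputStream(scratch)); // the only encode
            ByteArrayOutputStream target = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(target);
            out.writeInt(scratch.size());
            scratch.writeTo(out);
            return target.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Both methods produce the same bytes; the single-pass version trades the second encode for a buffer copy, and the scratch buffer can be reused across rows.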

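For point (2), walking a tree by handing it a function, rather than asking it for an iterator, might look like the sketch below. {{SketchTree}} is a hypothetical, deliberately tiny binary search tree used only to show the shape of the API; Cassandra's {{BTree}} is a different and more elaborate structure.

```java
import java.util.function.Consumer;

// Hypothetical minimal binary search tree, for illustration only.
final class SketchTree<T extends Comparable<T>> {
    private static final class Node<T> {
        final T value;
        Node<T> left, right;
        Node(T value) { this.value = value; }
    }

    private Node<T> root;

    // Inserts a value, ignoring duplicates.
    void add(T value) {
        if (root == null) { root = new Node<>(value); return; }
        Node<T> n = root;
        while (true) {
            int cmp = value.compareTo(n.value);
            if (cmp == 0) return;
            if (cmp < 0) {
                if (n.left == null) { n.left = new Node<>(value); return; }
                n = n.left;
            } else {
                if (n.right == null) { n.right = new Node<>(value); return; }
                n = n.right;
            }
        }
    }

    // Visits every element in ascending order, or descending order when
    // reversed is true, without building an iterator object for the walk.
    void apply(Consumer<? super T> fn, boolean reversed) {
        walk(root, fn, reversed);
    }

    private void walk(Node<T> n, Consumer<? super T> fn, boolean reversed) {
        if (n == null) return;
        walk(reversed ? n.right : n.left, fn, reversed);
        fn.accept(n.value);
        walk(reversed ? n.left : n.right, fn, reversed);
    }
}
```

A serializer would then call something like tree.apply(cell -> write(cell), false), where write is whatever per-element function it needs.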

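For point (3), the names FastThreadLocal and FastThreadLocalThread match classes in Netty's io.netty.util.concurrent package. The sketch below does not use Netty's API; it is a simplified illustration of the general idea as I understand it, where a dedicated thread subclass carries an array of slots and each thread local is just an index into that array. {{SlotThread}} and {{SlotLocal}} are hypothetical names.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical thread subclass that carries its own slot array.
class SlotThread extends Thread {
    Object[] slots = new Object[8];

    SlotThread(Runnable task) { super(task); }
}

// Hypothetical thread local that resolves to an array index on SlotThreads
// and falls back to a plain java.lang.ThreadLocal on any other thread.
final class SlotLocal<T> {
    private static final AtomicInteger NEXT_INDEX = new AtomicInteger();

    private final int index = NEXT_INDEX.getAndIncrement();
    private final Supplier<T> initial;
    private final ThreadLocal<T> fallback;

    SlotLocal(Supplier<T> initial) {
        this.initial = initial;
        this.fallback = ThreadLocal.withInitial(initial);
    }

    @SuppressWarnings("unchecked")
    T get() {
        Thread current = Thread.currentThread();
        if (!(current instanceof SlotThread)) {
            return fallback.get();
        }
        SlotThread t = (SlotThread) current;
        // Only the owning thread touches its slots, so no synchronization is needed.
        if (index >= t.slots.length) {
            t.slots = Arrays.copyOf(t.slots, Math.max(index + 1, t.slots.length * 2));
        }
        Object value = t.slots[index];
        if (value == null) {
            // Limitation of this sketch: a null initial value is recomputed on every call.
            value = initial.get();
            t.slots[index] = value;
        }
        return (T) value;
    }
}
```

The fallback path is what makes moving the threads themselves matter: code running on an ordinary Thread still works, but only threads of the dedicated subclass get the indexed lookup.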


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
