cassandra-commits mailing list archives

From "Dikang Gu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-13474) Cassandra pluggable storage engine
Date Tue, 06 Mar 2018 16:28:00 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-13474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dikang Gu updated CASSANDRA-13474:
----------------------------------
    Description: 
Instagram is working on a project to significantly reduce Cassandra's tail latency by implementing
a new storage engine on top of RocksDB, named Rocksandra.

We started with a prototype for the single-column (key-value) use case, and then implemented a full
design that supports most of Cassandra's data types and data models, as well as streaming.

After a year of development and testing, we have rolled out Rocksandra to our
internal deployments and observed a 3-4X reduction in P99 read latency in general, and a
reduction of more than 10X for some use cases.

We published a blog post about the wins and the benchmark metrics in an AWS environment: https://engineering.instagram.com/open-sourcing-a-10x-reduction-in-apache-cassandra-tail-latency-d64f86b43589

I think the biggest performance win comes from getting rid of most of the Java garbage created by
the current read/write path and compactions, which reduces JVM overhead and makes latency
more predictable.

We are very excited about the potential performance gain. As the next step, I propose making
the Cassandra storage engine pluggable (as in MySQL and MongoDB), and we are very interested
in providing RocksDB as one storage option with more predictable performance, together with
the community.

Design doc for pluggable storage engine: https://docs.google.com/document/d/1suZlvhzgB6NIyBNpM9nxoHxz_Ri7qAm-UEO8v8AIFsc/edit
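
To illustrate what "pluggable" could mean in practice, here is a minimal sketch in Java. It is not taken from the design doc or from Cassandra's code base: the names StorageEngine, InMemoryEngine and EngineRegistry are hypothetical, and a real engine API would also have to cover partitions, clustering, range reads, streaming and compaction. The sketch only shows the general pattern of hiding the storage backend behind an interface and selecting an implementation by a configured name.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical engine contract; illustrative only, not a Cassandra API.
interface StorageEngine {
    void put(String key, byte[] value);
    Optional<byte[]> get(String key);
    void delete(String key);
}

// Trivial in-memory engine standing in for a real backend such as RocksDB.
final class InMemoryEngine implements StorageEngine {
    private final Map<String, byte[]> data = new ConcurrentHashMap<>();

    @Override
    public void put(String key, byte[] value) {
        data.put(key, value.clone()); // defensive copy of the caller's array
    }

    @Override
    public Optional<byte[]> get(String key) {
        byte[] value = data.get(key);
        return value == null ? Optional.empty() : Optional.of(value.clone());
    }

    @Override
    public void delete(String key) {
        data.remove(key);
    }
}

// Maps a configured engine name to a factory, so callers never name a concrete class.
final class EngineRegistry {
    private final Map<String, Supplier<StorageEngine>> factories = new ConcurrentHashMap<>();

    void register(String name, Supplier<StorageEngine> factory) {
        factories.put(name, factory);
    }

    StorageEngine create(String name) {
        Supplier<StorageEngine> factory = factories.get(name);
        if (factory == null) {
            throw new IllegalArgumentException("Unknown storage engine: " + name);
        }
        return factory.get();
    }
}

public class PluggableEngineDemo {
    public static void main(String[] args) {
        EngineRegistry registry = new EngineRegistry();
        registry.register("in-memory", InMemoryEngine::new);

        // The engine name would normally come from configuration.
        StorageEngine engine = registry.create("in-memory");
        engine.put("k1", "v1".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(engine.get("k1").get(), StandardCharsets.UTF_8));

        engine.delete("k1");
        System.out.println(engine.get("k1").isPresent());
    }
}
```

Under these assumptions, adding a RocksDB-backed option would mean writing another StorageEngine implementation and registering it under its own name, while the calling code stays unchanged.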

  was:
We ran an experiment to switch Cassandra's storage engine to RocksDB.

In the experiment, I built a prototype integrating Cassandra 3.0.12 with RocksDB for the
single-column (key-value) use case, shadowed one of our production use cases, and saw about a
4-6X drop in P99 read latency during peak time, compared to 3.0.12. The P99 latency also became
more predictable.

Here is a detailed note with more metrics:

[https://docs.google.com/document/d/1Ztqcu8Jzh4USKoWBgDJQw82DBurQmsV-PmfiJYvu_Dc/edit?usp=sharing]

I think the biggest latency win comes from getting rid of most of the Java garbage created by the
current read/write path and compactions, which reduces JVM overhead and makes latency
more predictable.

We are very excited about the potential performance gain. As the next step, I propose making
the Cassandra storage engine pluggable (as in MySQL and MongoDB), and we are very interested
in providing RocksDB as one storage option with more predictable performance, together with
the community.

Design doc for pluggable storage engine: https://docs.google.com/document/d/1suZlvhzgB6NIyBNpM9nxoHxz_Ri7qAm-UEO8v8AIFsc/edit


> Cassandra pluggable storage engine
> ----------------------------------
>
>                 Key: CASSANDRA-13474
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13474
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Dikang Gu
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


