cassandra-commits mailing list archives

From "Dikang Gu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-13475) First version of pluggable storage engine API.
Date Tue, 31 Oct 2017 06:47:05 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-13475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226344#comment-16226344 ]

Dikang Gu commented on CASSANDRA-13475:
---------------------------------------

[~bdeggleston], I see what you mean, but my point is that ColumnFamilyStore is a very complicated
class that leaks storage details, such as the sstable concept, in almost every function
it provides. In the future, every call stack that touches the APIs leaking those storage
details should be moved into a CQLStorageEngine (or CQLColumnFamilyStore, in your words). And
I'm not sure that cleaning up ColumnFamilyStore is the top priority at this moment.

The process I have in mind is:
1. We define the new API for the common workload. This does not require a big refactor of
Cassandra's code yet, but it can hide a new storage engine implementation. This is demonstrated
in our RocksDBEngine implementation.
2. We start to refactor/clean up ColumnFamilyStore and Keyspace, which means we implement a
CQLStorageEngine and move the current storage-related logic into it. As you said, 99.99%
of the work will be here. Based on our experience of implementing the RocksDBEngine,
we should be able to do it step by step, moving things piece by piece.
3. I imagine we will go through many iterations of steps 1 and 2, refining the API and
cleaning up the CFS/Keyspace classes as we go. In the end, I think CFS/Keyspace will become
a thin wrapper around the storage engine API.
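
A minimal sketch of where step 3 could end up, assuming a deliberately tiny engine API. Every name here (EngineApi, CqlEngineStandIn, AlternativeEngineStandIn, ThinColumnFamilyStore) is hypothetical and chosen for illustration; none of it is the interface from the patch or Cassandra's actual code, and the two engines are plain in-memory maps rather than sstables or RocksDB.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical engine API, reduced to two calls for the sake of the example.
interface EngineApi {
    void apply(String key, String value);
    Optional<String> query(String key);
}

// Stand-in for the existing storage path after it has been moved behind the API (step 2).
final class CqlEngineStandIn implements EngineApi {
    private final Map<String, String> rows = new TreeMap<>();

    @Override public void apply(String key, String value) { rows.put(key, value); }
    @Override public Optional<String> query(String key) { return Optional.ofNullable(rows.get(key)); }
}

// Stand-in for an alternative engine plugged in through the same API (step 1).
final class AlternativeEngineStandIn implements EngineApi {
    private final Map<String, String> rows = new HashMap<>();

    @Override public void apply(String key, String value) { rows.put(key, value); }
    @Override public Optional<String> query(String key) { return Optional.ofNullable(rows.get(key)); }
}

// After enough iterations (step 3) the store holds no storage details and only delegates.
final class ThinColumnFamilyStore {
    private final EngineApi engine;

    ThinColumnFamilyStore(EngineApi engine) { this.engine = engine; }

    void write(String key, String value) { engine.apply(key, value); }
    Optional<String> read(String key) { return engine.query(key); }
}

final class ThinWrapperDemo {
    public static void main(String[] args) {
        EngineApi[] engines = { new CqlEngineStandIn(), new AlternativeEngineStandIn() };
        for (EngineApi engine : engines) {
            // The caller code is identical regardless of which engine is plugged in.
            ThinColumnFamilyStore cfs = new ThinColumnFamilyStore(engine);
            cfs.write("k", "v");
            System.out.println(engine.getClass().getSimpleName() + " -> " + cfs.read("k").orElse("<missing>"));
        }
    }
}
```

The point of the sketch is only that callers depend on the wrapper, so the engine behind it can be swapped or refined one small patch at a time.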

I don't think there are big differences between our proposals. Even the IColumnFamilyStore
interface, I imagine, will be pretty similar to the StorageEngine interface I propose.
But I don't want to change every call site to use the IColumnFamilyStore interface at step 1,
since that requires too much refactoring work at once, and I prefer many small patches over
one big patch for the refactoring. Small patches are also easier to test and to cover well
with tests.

What do you think?

> First version of pluggable storage engine API.
> ----------------------------------------------
>
>                 Key: CASSANDRA-13475
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13475
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Dikang Gu
>            Assignee: Dikang Gu
>
> In order to support pluggable storage engines, we need to define a unified interface/API
> that allows us to plug in different storage engines for different requirements.
> At a very high level, the storage engine interface should include APIs to:
> 1. Apply update into the engine.
> 2. Query data from the engine.
> 3. Stream data in/out to/from the engine.
> 4. Table operations, like create/drop/truncate a table, etc.
> 5. Various stats about the engine.
> I created this ticket to start the discussion about the interface.
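
As a rough illustration of the five API categories listed in the description, here is a minimal sketch of such an interface with a toy in-memory implementation. The interface name matches the StorageEngine mentioned in the comment, but every method name and signature below is an assumption made for this example, not the API from the ticket's patch. Rows are simplified to string key/value pairs, and "streaming" is reduced to copying a snapshot of a table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical interface: one group of methods per category in the description.
interface StorageEngine {
    void apply(String table, String key, String value);                   // 1. apply an update
    Optional<String> query(String table, String key);                     // 2. query data
    List<Map.Entry<String, String>> streamOut(String table);              // 3. stream data out
    void streamIn(String table, List<Map.Entry<String, String>> incoming); // 3. stream data in
    void createTable(String table);                                       // 4. table operations
    void dropTable(String table);
    void truncate(String table);
    Map<String, Long> stats();                                            // 5. engine stats
}

// Toy engine that keeps each table as a sorted in-memory map (an assumption for the demo).
final class InMemoryEngine implements StorageEngine {
    private final Map<String, ConcurrentSkipListMap<String, String>> tables = new ConcurrentHashMap<>();
    private final AtomicLong writes = new AtomicLong();
    private final AtomicLong reads = new AtomicLong();

    private ConcurrentSkipListMap<String, String> rows(String table) {
        ConcurrentSkipListMap<String, String> rows = tables.get(table);
        if (rows == null) throw new IllegalArgumentException("unknown table: " + table);
        return rows;
    }

    @Override public void apply(String table, String key, String value) {
        rows(table).put(key, value);
        writes.incrementAndGet();
    }

    @Override public Optional<String> query(String table, String key) {
        reads.incrementAndGet();
        return Optional.ofNullable(rows(table).get(key));
    }

    @Override public List<Map.Entry<String, String>> streamOut(String table) {
        return new ArrayList<>(rows(table).entrySet()); // snapshot of the table's entries
    }

    @Override public void streamIn(String table, List<Map.Entry<String, String>> incoming) {
        ConcurrentSkipListMap<String, String> target = rows(table);
        for (Map.Entry<String, String> e : incoming) target.put(e.getKey(), e.getValue());
    }

    @Override public void createTable(String table) { tables.putIfAbsent(table, new ConcurrentSkipListMap<>()); }
    @Override public void dropTable(String table) { tables.remove(table); }
    @Override public void truncate(String table) { rows(table).clear(); }

    @Override public Map<String, Long> stats() {
        Map<String, Long> s = new TreeMap<>();
        s.put("reads", reads.get());
        s.put("tables", (long) tables.size());
        s.put("writes", writes.get());
        return s;
    }
}

final class StorageEngineSketch {
    public static void main(String[] args) {
        StorageEngine source = new InMemoryEngine();
        source.createTable("users");
        source.apply("users", "k1", "v1");

        // Move the table's contents from one engine instance to another.
        StorageEngine target = new InMemoryEngine();
        target.createTable("users");
        target.streamIn("users", source.streamOut("users"));

        System.out.println(target.query("users", "k1").orElse("<missing>"));
        System.out.println(source.stats());
    }
}
```

A real interface would work with Cassandra's own mutation, read-command, and streaming types rather than strings; the sketch only shows how the five categories could sit side by side in one contract.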



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

