Date: Sun, 15 Mar 2015 17:59:41 +0000 (UTC)
From: "Robert Stupp (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-8099) Refactor and modernize the storage engine

    [ https://issues.apache.org/jira/browse/CASSANDRA-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362475#comment-14362475 ]

Robert Stupp commented on CASSANDRA-8099:
-----------------------------------------

As you say in [the doc|https://issues.apache.org/jira/browse/CASSANDRA-8971], naming of _atom_ is really bad and should be changed IMO. Some proposals:
* -{{Cluster}} - it's like that - but _cluster_ is an occupied term-
* {{Line}} (slightly similar to _row_)
* {{Assembly}}
* or maybe just {{RawRow}}

{{NamesPartitionFilter}} - not sure whether _names_ is a good word here.
Propose {{ClusteringPartitionFilter}} or {{ClusteredPartitionFilter}}

Good idea to make {{CachePartition}} an interface!

BTW: interesting to see that the term _Doppelgänger_ is known in English ;)

> Refactor and modernize the storage engine
> -----------------------------------------
>
>                 Key: CASSANDRA-8099
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8099
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 3.0
>
>         Attachments: 8099-nit
>
>
> The current storage engine (which for this ticket I'll loosely define as "the code implementing the read/write path") is suffering from old age. One of the main problems is that the only structure it deals with is the cell, which completely ignores the higher-level CQL structure that groups cells into (CQL) rows.
> This leads to many inefficiencies, like the fact that during a read we have to group cells multiple times (to count on the replica, then to count on the coordinator, then to produce the CQL resultset) because we forget about the grouping right away each time (so lots of useless cell name comparisons in particular). But beyond the inefficiencies, having to manually recreate the CQL structure every time we need it for something is hindering new features and makes the code more complex than it should be.
> Said storage engine also has tons of technical debt. To pick an example, the fact that during range queries we update {{SliceQueryFilter.count}} is pretty hacky and error prone. Or the overly complex lengths {{AbstractQueryPager}} has to go to simply to "remove the last query result".
> So I want to bite the bullet and modernize this storage engine. I propose to do 2 main things:
> # Make the storage engine more aware of the CQL structure.
>   In practice, instead of having partitions be a simple iterable map of cells, it should be an iterable list of rows (each being itself composed of per-column cells, though obviously not exactly the same kind of cell we have today).
> # Make the engine more iterative. What I mean here is that in the read path, we end up reading all cells in memory (we put them in a ColumnFamily object), but there is really no reason to. If instead we were working with iterators all the way through, we could get to a point where we're basically transferring data from disk to the network, and we should be able to reduce GC substantially.
> Please note that such a refactor should provide some performance improvements right off the bat, but that is not its primary goal either. Its primary goal is to simplify the storage engine and add abstractions that are better suited to further optimizations.
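
The two proposals in the quoted description (partitions as an iterable of rows, and iterators all the way through the read path) can be illustrated with a small sketch. This is Python for brevity, not Cassandra's Java code: `Cell`, `Row`, `rows` and `read_from_disk` are hypothetical names invented for this illustration and do not correspond to classes in the actual patch. It also assumes the cell stream arrives already ordered by clustering key.

```python
from dataclasses import dataclass
from itertools import groupby, islice
from typing import Dict, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Cell:
    """Hypothetical cell: one column value tagged with its row's clustering key."""
    clustering: Tuple[int, ...]
    column: str
    value: object


@dataclass
class Row:
    """Hypothetical CQL row: the per-column values sharing one clustering key."""
    clustering: Tuple[int, ...]
    columns: Dict[str, object]


def rows(cells: Iterable[Cell]) -> Iterator[Row]:
    """Lazily group a clustering-ordered stream of cells into rows.

    The grouping happens once, and only one row is held in memory at a time.
    """
    for clustering, group in groupby(cells, key=lambda c: c.clustering):
        yield Row(clustering, {c.column: c.value for c in group})


def read_from_disk() -> Iterator[Cell]:
    """Hypothetical stand-in for an on-disk reader; yields cells in order."""
    for i in range(1_000_000):
        yield Cell((i,), "name", f"user{i}")
        yield Cell((i,), "age", 20 + i % 50)


# A "LIMIT 2" style read: only the cells needed for two rows are ever pulled.
first_two = list(islice(rows(read_from_disk()), 2))
```

Because every stage is a generator, asking for two rows pulls only five cells from the source (the fifth one signals the end of the second row). Nothing resembling a fully materialized in-memory partition is built, which is the kind of GC saving the description is aiming for.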