cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9459) SecondaryIndex API redesign
Date Fri, 21 Aug 2015 11:19:46 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14706556#comment-14706556
] 

Sylvain Lebresne commented on CASSANDRA-9459:
---------------------------------------------

Alright, follows my comments for the rest of the patch review:
* Would be nice to move the calls of {{SecondaryIndexManager.validate}} out of {{UpdateParameters}}/{{CassandraServer}}
to a single place, like maybe {{Keyspace.apply}} so it's in only one place and we're sure
we cannot miss it on any path.
* I would remove {{ColumnIndexMetadata}}, inlining the fields directly into {{ColumnIndex}}
(with proper getters). I think the indirection makes things a bit harder to follow (and more
verbose) for no benefit I can see (typically we store the {{baseCfs}} twice, in both {{ColumnIndex}}
and {{ColumnIndexMetadata}}. Or {{CassandraIndexSearcher}} holds bothe the {{CassandraIndex}}
and {{CassandraIndexMetadata}} even though they are essentially the same thing). If you really
hate removing it (but I _really_ think it would be cleaner to do so), I think we should at
least rename it, cause it currently sounds like it's a specialisation of {{IndexMetadata}}
while it's not (and more generally it doesn't fit into what we call "Metadata" in general).
* Still not a fan of {{ColumnIndexFunctions}}. I think having subclasses of {{CassandraIndex}}
would be a lot more idiomatic (since this, to me, fits the exact definition of what inheritance
is about) and hence simpler (since more direct/expected). Would also be a tiny bit less verbose
since you won't have to basically pass the index to most function.
* To have {{CassandraIndexSearcher}} be an {{Index.Searcher}} but {{ColumnIndexSearcher}}
(and its subclasses) not be one is a bit surprising/inconsistent naming wise. And really,
having 2 separate class here feels more complexity than necessary (after all, {{CassandraIndex}}
is our "ColumnIndex" implementation so why 2 searcher class?). So I would just merge {{ColumnIndexSearcher}}
into {{CassandraIndexSearcher}} (having thus {{KeysSearcher}} and {{CompositesSearcher}} be
sublcasses of {{CassandraIndexSearcher}}).
* In {{ColumnFamilyStore.scrubDataDirectories}}, we basically use {{!index.isCustom()}} to
select index that have a backing CFS. Why not using SecondaryIndexManager.getAllIndexStorageTables()
instead, that's more consistent with how we do it in other places (I'll note you've mostly
just kept the way it was done before, but could still be nice to clean it imo).

I've also pushed [a commit|https://github.com/pcmanus/cassandra/commits/9459-nits] with a
few very minor nits that would have taken longer to explain that do.


> SecondaryIndex API redesign
> ---------------------------
>
>                 Key: CASSANDRA-9459
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9459
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sam Tunnicliffe
>            Assignee: Sam Tunnicliffe
>             Fix For: 3.0 beta 1
>
>
> For some time now the index subsystem has been a pain point and in large part this is
due to the way that the APIs and principal classes have grown organically over the years.
It would be a good idea to conduct a wholesale review of the area and see if we can come up
with something a bit more coherent.
> A few starting points:
> * There's a lot in AbstractPerColumnSecondaryIndex & its subclasses which could be
pulled up into SecondaryIndexSearcher (note that to an extent, this is done in CASSANDRA-8099).
> * SecondayIndexManager is overly complex and several of its functions should be simplified/re-examined.
The handling of which columns are indexed and index selection on both the read and write paths
are somewhat dense and unintuitive.
> * The SecondaryIndex class hierarchy is rather convoluted and could use some serious
rework.
> There are a number of outstanding tickets which we should be able to roll into this higher
level one as subtasks (but I'll defer doing that until getting into the details of the redesign):
> * CASSANDRA-7771
> * CASSANDRA-8103
> * CASSANDRA-9041
> * CASSANDRA-4458
> * CASSANDRA-8505
> Whilst they're not hard dependencies, I propose that this be done on top of both CASSANDRA-8099
and CASSANDRA-6717. The former largely because the storage engine changes may facilitate a
friendlier index API, but also because of the changes to SIS mentioned above. As for 6717,
the changes to schema tables there will help facilitate CASSANDRA-7771.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message