cassandra-commits mailing list archives

From "Sam Tunnicliffe (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-10681) make index building pluggable via IndexBuildTask
Date Fri, 20 Nov 2015 12:41:11 GMT


Sam Tunnicliffe commented on CASSANDRA-10681:

My solution is almost identical to your first one; the only real difference is that it groups
the tasks by the class of the task itself, rather than by the index class. That way, all existing
index implementations can continue to share a single task without any modifications. To avoid
constructing task instances which are immediately discarded, I did consider having the
new method on {{Index}} return the class of the build task instead of an instance and handling
construction via reflection, but the added complexity doesn't seem worth it right now. Patch
on top of your initial commit [here|]
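
For illustration, the grouping described above might look roughly like the sketch below. This is
not the code from the patch. The method name {{newBuildTask}}, the abstract {{IndexBuildTask}}
class, the two task subclasses and the sample index names are all hypothetical, chosen only to
show how keying on the task's class lets several indexes share one task while one instance per
extra index is created and dropped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

public class IndexBuildGrouping {

    /** Hypothetical stand-in for the proposed IndexBuildTask. */
    public abstract static class IndexBuildTask {
        private final List<Index> indexes = new ArrayList<>();

        public void add(Index index) { indexes.add(index); }
        public List<Index> indexes() { return indexes; }
        public abstract void build();
    }

    /** Hypothetical slice of Index: a name plus the new task-returning method. */
    public interface Index {
        String name();
        IndexBuildTask newBuildTask();
    }

    /** Assumed shared task: one merged pass covering every index that uses it. */
    public static class MergedBuildTask extends IndexBuildTask {
        @Override public void build() {
            System.out.println("merged build: " + names(indexes()));
        }
    }

    /** Assumed custom task for an index that builds per SSTable. */
    public static class PerSSTableBuildTask extends IndexBuildTask {
        @Override public void build() {
            System.out.println("per-SSTable build: " + names(indexes()));
        }
    }

    static List<String> names(List<Index> indexes) {
        List<String> out = new ArrayList<>();
        for (Index index : indexes) out.add(index.name());
        return out;
    }

    /** Minimal Index implementation used only for this illustration. */
    public static Index index(String name, Supplier<IndexBuildTask> taskFactory) {
        return new Index() {
            @Override public String name() { return name; }
            @Override public IndexBuildTask newBuildTask() { return taskFactory.get(); }
        };
    }

    /** Groups indexes by the class of the task each one returns. */
    public static List<IndexBuildTask> groupByTaskClass(List<Index> indexes) {
        Map<Class<?>, IndexBuildTask> byClass = new LinkedHashMap<>();
        for (Index index : indexes) {
            IndexBuildTask candidate = index.newBuildTask();
            IndexBuildTask existing = byClass.putIfAbsent(candidate.getClass(), candidate);
            // If a task of this class is already registered, the candidate is discarded.
            IndexBuildTask task = (existing != null) ? existing : candidate;
            task.add(index);
        }
        return new ArrayList<>(byClass.values());
    }

    /** Made-up sample data: two indexes sharing a task class, one with its own. */
    public static List<Index> sampleIndexes() {
        return Arrays.asList(
            index("by_name", MergedBuildTask::new),
            index("by_age", MergedBuildTask::new),
            index("sasi_text", PerSSTableBuildTask::new));
    }

    public static void main(String[] args) {
        for (IndexBuildTask task : groupByTaskClass(sampleIndexes())) {
            task.build();
        }
    }
}
```

In this sketch the second {{MergedBuildTask}} instance is the one that gets thrown away, which is
the cost the reflection-based alternative would have avoided.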

> make index building pluggable via IndexBuildTask
> ------------------------------------------------
>                 Key: CASSANDRA-10681
>                 URL:
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Local Write-Read Paths
>            Reporter: Pavel Yaskevich
>            Assignee: Pavel Yaskevich
>            Priority: Minor
>              Labels: sasi
>             Fix For: 3.x
>         Attachments: 0001-add-table-support-for-multi-table-builds.patch, 0001-make-index-building-pluggable-via-IndexBuildTask.patch
> Currently index building assumes one and only one way to build all of the indexes - through
SecondaryIndexBuilder - which merges all of the sstables together, collates columns etc. This
works fine for built-in indexes but not for SASI, since it attaches to every SSTable individually.
We need an "IndexBuildTask" interface (based on CompactionInfo.Holder) to be returned from
Index on demand, to give SI interface implementers the power to decide how the build should work.
This might be less efficient for CassandraIndex, since it effectively means that collation
will have to be done multiple times on the same data, but it is nevertheless a good compromise
for a clean interface to the outside world.

This message was sent by Atlassian JIRA
