cassandra-commits mailing list archives

From Andrés de la Peña (JIRA) <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-12245) initial view build can be parallel
Date Sat, 18 Nov 2017 12:48:01 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-12245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258054#comment-16258054
] 

Andrés de la Peña commented on CASSANDRA-12245:
-----------------------------------------------

Thanks for the comments, this is almost finished :)

[Here|https://github.com/apache/cassandra/compare/trunk...adelapena:12245-trunk] is the new
version of the patch, rebased and squashed. The updated dtests can be found [here|https://github.com/apache/cassandra-dtest/compare/master...adelapena:12245].

bq. One minor thing is that we should probably only split the view build tasks at all if the
base table is larger than a given size (let's say 500MB or so?), to avoid 4 * num_processor
flushes for base tables with negligible size, WDYT?

As discussed, I have moved the base table flush from {{ViewBuilderTask}} to {{ViewBuilder}}
[here|https://github.com/adelapena/cassandra/commit/478ed88b490378caf4f8ddc82c8e3aa3f90e5264],
to do a single flush at the beginning of the view build. Subsequent writes will be written
to the MV through the regular path, so it seems that they won't need any further flushes. I
think that with this we don't need to check the table size or give special treatment to
small ones, what do you think?
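
To illustrate the idea of flushing once up front and then building sub-ranges in parallel, here is a minimal sketch. It is not the code in the patch: {{SingleFlushDemo}}, {{flushBaseTable}}, {{split}} and {{build}} are hypothetical names, token ranges are simplified to plain {{long}} intervals, and each task only reports the size of its range instead of writing to a view.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class SingleFlushDemo
{
    // Counts how many times the (hypothetical) base table flush was requested.
    final AtomicInteger flushes = new AtomicInteger();

    // Stand-in for flushing the base table; a real flush would write memtables to disk.
    void flushBaseTable()
    {
        flushes.incrementAndGet();
    }

    // Splits [start, end) into `parts` contiguous sub-ranges of near-equal size.
    // Assumes start <= end, parts > 0 and that (end - start) does not overflow.
    static List<long[]> split(long start, long end, int parts)
    {
        long size = end - start;
        long step = size / parts;
        long remainder = size % parts;
        List<long[]> ranges = new ArrayList<>();
        long left = start;
        for (int i = 0; i < parts; i++)
        {
            long right = left + step + (i < remainder ? 1 : 0);
            ranges.add(new long[]{ left, right });
            left = right;
        }
        return ranges;
    }

    // Flushes once, then builds every sub-range on a fixed-size pool.
    // Returns the total number of keys "built" across all tasks.
    long build(long start, long end, int parts, int threads) throws Exception
    {
        flushBaseTable(); // single flush, done by the coordinator and not by each task

        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try
        {
            List<Future<Long>> futures = new ArrayList<>();
            for (long[] range : split(start, end, parts))
            {
                // Stand-in for a per-range build task: it just reports the range size.
                Callable<Long> task = () -> range[1] - range[0];
                futures.add(executor.submit(task));
            }
            long total = 0;
            for (Future<Long> future : futures)
                total += future.get();
            return total;
        }
        finally
        {
            executor.shutdown();
        }
    }
}
```

With this shape the number of flushes does not depend on the number of tasks, which is why a size threshold for small tables no longer seems necessary.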

bq. I noticed we don't stop in-progress view builds when a view is removed, would you mind
adding that?

Right, good catch. Done [here|https://github.com/adelapena/cassandra/commit/e1ace2f47be71d48ab1987d0e2c7a07cc9486e97].
I have also added [this dtest|https://github.com/adelapena/cassandra-dtest/blob/12245/materialized_views_test.py#L1025-L1067]
to verify that the build is properly stopped.
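
As a rough illustration of how a running build can be stopped cooperatively, here is a minimal sketch. It is not taken from the linked commit: {{StoppableBuildDemo}}, {{BuildTask}}, {{stop}} and {{buildThenDrop}} are hypothetical names, and the "build" is just a counter loop that checks a volatile flag between keys.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class StoppableBuildDemo
{
    // Hypothetical build task: processes keys one by one until it is done or stopped.
    static class BuildTask implements Callable<Long>
    {
        private final long totalKeys;
        private final CountDownLatch started = new CountDownLatch(1);
        private volatile boolean stopped = false; // volatile so the worker thread sees the update

        BuildTask(long totalKeys)
        {
            this.totalKeys = totalKeys;
        }

        // Called when the view is removed; the task exits at its next check of the flag.
        void stop()
        {
            stopped = true;
        }

        void awaitStarted() throws InterruptedException
        {
            started.await();
        }

        @Override
        public Long call()
        {
            started.countDown(); // signal that the build is running
            long built = 0;
            while (built < totalKeys && !stopped)
                built++; // stand-in for building the view rows of one base table key
            return built;
        }
    }

    // Starts a build, simulates the view being dropped, and returns how many keys were built.
    static long buildThenDrop(long totalKeys) throws Exception
    {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try
        {
            BuildTask task = new BuildTask(totalKeys);
            Future<Long> result = executor.submit(task);
            task.awaitStarted();
            task.stop(); // the view was removed while the build was in progress
            return result.get(10, TimeUnit.SECONDS);
        }
        finally
        {
            executor.shutdownNow();
        }
    }
}
```

Calling {{buildThenDrop(Long.MAX_VALUE)}} returns promptly with a partial count, since the loop notices the flag long before it could exhaust the keys.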

bq. ViewBuildExecutor is being constructed with minThreads=1 and maxPoolSize=concurrent_materialized_view_builders,
but according to the {{DebuggableThreadPoolExecutor}}'s [javadoc|https://github.com/apache/cassandra/blob/8b3a60b9a7dbefeecc06bace617279612ec7092d/src/java/org/apache/cassandra/concurrent/DebuggableThreadPoolExecutor.java#L33],
this will actually make the executor with size 1 since maxPoolSize is not supported by {{DebuggableThreadPoolExecutor}}
- and even if it were, new threads would only be created after the queue of the initial threads
were full (which is quite unintuitive), but we actually want the pool to have concurrent_materialized_view_builders
concurrent threads at most, so we should use the {{threadCount}} constructor instead - at
some point we should actually remove the maximumPoolSize

Done [here|https://github.com/adelapena/cassandra/commit/fc14b034bb5d36c23435f313541445dc5adb0078].
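
For reference, the queueing behaviour described in the quote can be reproduced with the plain JDK {{java.util.concurrent.ThreadPoolExecutor}}. The sketch below does not use {{DebuggableThreadPoolExecutor}} or any Cassandra class; {{PoolSizeDemo}} and {{largestPoolSize}} are hypothetical names. It submits blocking tasks to a pool backed by an unbounded queue and reports the largest number of threads the pool ever created.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolSizeDemo
{
    // Submits `tasks` blocking tasks and returns the largest number of threads the pool ever had.
    static int largestPoolSize(int corePoolSize, int maxPoolSize, int tasks) throws InterruptedException
    {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(corePoolSize, maxPoolSize,
                                                             60, TimeUnit.SECONDS,
                                                             new LinkedBlockingQueue<>()); // unbounded queue
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++)
        {
            // Each task blocks until released, so all submitted tasks are pending at the same time.
            executor.execute(() -> {
                try
                {
                    release.await();
                }
                catch (InterruptedException e)
                {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int largest = executor.getLargestPoolSize();
        release.countDown();
        executor.shutdown();
        executor.awaitTermination(10, TimeUnit.SECONDS);
        return largest;
    }

    public static void main(String[] args) throws InterruptedException
    {
        // core=1, max=4: extra threads are only created when the queue rejects a task,
        // which an unbounded queue never does, so the pool stays at one thread.
        System.out.println(largestPoolSize(1, 4, 8)); // prints 1
        // core=4, max=4: the pool grows to four threads before it starts queueing.
        System.out.println(largestPoolSize(4, 4, 8)); // prints 4
    }
}
```

This is why a fixed thread count equal to {{concurrent_materialized_view_builders}} gives the intended level of parallelism.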

bq. I think we could take a {{buildAllViews}} parameter on reload, and set that to false during
Keyspace initialization, since views will be built during daemon initialization and keyspace
changes anyway, WDYT?

Makes sense, done [here|https://github.com/adelapena/cassandra/commit/c4f19a5461434c0d5ca5e1301d92da26cca5083e].

bq. One last thing, can you please add the new yaml option {{concurrent_materialized_view_builders}}
to the configuration section of the doc?

It seems that [the configuration section|https://github.com/apache/cassandra/blob/trunk/doc/source/configuration/index.rst]
of the doc is currently empty. I think that writing this section (structure, introduction,
etc.) is probably out of the scope of this ticket and could be done in a separate, dedicated
ticket. Instead, I have [updated|https://github.com/adelapena/cassandra/commit/30293f852584189a5b46c2dce5ae4042ae62d3e4]
the NEWS.txt file with more detailed info and I have added [a note|https://github.com/adelapena/cassandra/commit/82c446398d0b6b4b1b13b35b3502489fc71fe703]
to the doc about {{CREATE MATERIALIZED VIEW}} statement. WDYT?

I have updated the dtest {{interrupt_build_process_test}} to make sure that the build is really
interrupted in 3.x as well, through [new byteman scripts|https://github.com/adelapena/cassandra-dtest/blob/f7aac39ee5d0c661b2f7f5b1db2a7347635f85c5/materialized_views_test.py#L962-L963].
Without that, the build could finish before the cluster stops.

The CI results look good, at least for MVs:
||[utest|http://jenkins-cassandra.datastax.lan/view/Dev/view/adelapena/job/adelapena-12245-trunk-testall/]||[dtest|http://jenkins-cassandra.datastax.lan/view/Dev/view/adelapena/job/adelapena-12245-trunk-dtest/]||

> initial view build can be parallel
> ----------------------------------
>
>                 Key: CASSANDRA-12245
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12245
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Materialized Views
>            Reporter: Tom van der Woerdt
>            Assignee: Andrés de la Peña
>             Fix For: 4.x
>
>
> On a node with lots of data (~3TB) building a materialized view takes several weeks,
which is not ideal. It's doing this in a single thread.
> There are several potential ways this can be optimized :
>  * do vnodes in parallel, instead of going through the entire range in one thread
>  * just iterate through sstables, not worrying about duplicates, and include the timestamp
of the original write in the MV mutation. since this doesn't exclude duplicates it does increase
the amount of work and could temporarily surface ghost rows (yikes) but I guess that's why
they call it eventual consistency. doing it this way can avoid holding references to all tables
on disk, allows parallelization, and removes the need to check other sstables for existing
data. this is essentially the 'do a full repair' path



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

