cassandra-commits mailing list archives

From Andrés de la Peña (JIRA) <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10130) Node failure during 2i update after streaming can have incomplete 2i when restarted
Date Wed, 17 May 2017 13:19:04 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16014004#comment-16014004 ]

Andrés de la Peña commented on CASSANDRA-10130:
-----------------------------------------------

bq. do we really need a {{PendingIndexBuildsCounter}} class? I thought it would be simpler
to make {{pendingBuilds}} a {{Map<String, AtomicInteger>}} and keep the inc/dec logic
on {{markIndexBuilding}} and {{markIndexBuilt}} instead, which are the only consumers of this
class.
Probably not. The {{AtomicInteger}}s would only be manipulated inside a {{synchronized}}
block, so a mutable integer that is not thread-safe would be enough. This is what led me to
build the {{PendingIndexBuildsCounter}} class. But it is true that it is easier to just use a
{{Map<String, AtomicInteger>}} and keep the logic inside the {{markIndex*}} methods.
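
A minimal sketch of the simpler variant, assuming the counters live in a plain map guarded by
{{synchronized}} methods. The class name {{PendingBuildsSketch}}, the {{isBuilt}} helper and the
in-memory {{builtIndexes}} set are hypothetical and are not taken from the patch. Ignoring an
unpaired {{markIndexBuilt}} call is also just one possible choice.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not the code from the patch.
class PendingBuildsSketch
{
    // Plain Integer values suffice because every access is synchronized.
    private final Map<String, Integer> pendingBuilds = new HashMap<>();
    // Stand-in for wherever the "built" status is really recorded.
    private final Set<String> builtIndexes = new HashSet<>();

    synchronized void markIndexBuilding(String indexName)
    {
        // The first pending build invalidates the built status.
        if (pendingBuilds.merge(indexName, 1, Integer::sum) == 1)
            builtIndexes.remove(indexName);
    }

    synchronized void markIndexBuilt(String indexName)
    {
        Integer pending = pendingBuilds.get(indexName);
        if (pending == null)
            return; // unpaired call: ignored in this sketch (an assumption)

        if (pending == 1)
        {
            // Last pending build finished: only now is the index marked as built.
            pendingBuilds.remove(indexName);
            builtIndexes.add(indexName);
        }
        else
        {
            pendingBuilds.put(indexName, pending - 1);
        }
    }

    synchronized boolean isBuilt(String indexName)
    {
        return builtIndexes.contains(indexName);
    }
}
```

With two overlapping builds of the same index, the index is reported as built only after the
second {{markIndexBuilt}} call.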

bq. I'm kinda unsure if we should actually call {{markIndexBuilding}} inside {{createIndex}}:
what if the user forgets to call {{markIndexBuilt}} inside the initialization task? There's
no such contract forcing the user to do that. So, as a corollary, we should probably accept
calls to {{markIndexBuilt}} even without a previous {{markIndexBuilding}} call (that is, making
it idempotent).
It is true that an implementation could forget to call {{markIndexBuilding}} or {{markIndexBuilt}}.
But I think that if we relax the counter system we could reintroduce the race conditions that
we are trying to solve. I mean, a call to {{markIndexBuilt}} that actually marks the index as
built without a preceding {{markIndexBuilding}} could spoil the efforts of a properly paired
{{markIndexBuilding}}/{{markIndexBuilt}} usage. The approach suggested by [~pauloricardomg] could
help. Also, as pointed out, there is no contract forcing users to call any of the {{markIndex*}}
methods, so it's hard to anticipate all possible scenarios to avoid race conditions.
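
To illustrate the concern, here is a small sketch of one possible relaxed variant, in which
{{markIndexBuilt}} is accepted without a matching {{markIndexBuilding}}. The class
{{RelaxedMarkingSketch}} and its single-index state are hypothetical and only serve to show the
interleaving; they are not part of any proposed patch.

```java
// Hypothetical illustration of a relaxed counter, tracking a single index.
class RelaxedMarkingSketch
{
    private int pending = 0;
    private boolean built = false;

    synchronized void markIndexBuilding()
    {
        pending++;
        built = false;
    }

    // Relaxed: never fails or ignores a call, even without a matching markIndexBuilding.
    synchronized void markIndexBuilt()
    {
        pending = Math.max(0, pending - 1);
        if (pending == 0)
            built = true;
    }

    synchronized boolean isBuilt()
    {
        return built;
    }

    // Replays the problematic interleaving and reports the resulting status.
    static boolean builtWhileBuildStillRunning()
    {
        RelaxedMarkingSketch index = new RelaxedMarkingSketch();
        index.markIndexBuilding(); // a properly paired build starts and is still running
        index.markIndexBuilt();    // unpaired call from some other component
        return index.isBuilt();    // the index is reported as built before the build ends
    }
}
```

In this interleaving the unpaired call consumes the count that belonged to the running build,
so a node failure before that build completes would leave the index marked as built.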

As an alternative approach, we could hide all the {{markIndex*}} methods and let the
{{SecondaryIndexManager}} manage them internally. This should avoid the risk of index
implementations or other components making calls out of order. Here is a patch showing the approach:

||[trunk|https://github.com/apache/cassandra/compare/trunk...adelapena:10130-trunk]|[utests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-10130-trunk-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-10130-trunk-dtest/]|

It adds a {{Runnable}} argument to {{buildAllIndexesBlocking}} so that failed indexing of new
sstables can be retried, while guaranteeing that the hidden {{markIndex*}} methods are properly
called.
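
A rough sketch of the idea of hiding the marking behind the manager, assuming the caller only
hands over the work as a {{Runnable}}. Everything here is hypothetical and is not the signature
used in the patch: the class {{HiddenMarkingManagerSketch}}, the {{buildBlocking}} and
{{markIndexFinished}} methods, and the choice to leave the index unmarked when the task throws.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; names and failure handling are assumptions.
class HiddenMarkingManagerSketch
{
    private final Map<String, Integer> pendingBuilds = new HashMap<>();
    private final Set<String> builtIndexes = new HashSet<>();

    // Only entry point: callers pass the work, the manager does the marking.
    void buildBlocking(String indexName, Runnable buildTask)
    {
        markIndexBuilding(indexName);
        boolean succeeded = false;
        try
        {
            buildTask.run();
            succeeded = true;
        }
        finally
        {
            // Always paired with markIndexBuilding, even if the task throws.
            markIndexFinished(indexName, succeeded);
        }
    }

    private synchronized void markIndexBuilding(String indexName)
    {
        pendingBuilds.merge(indexName, 1, Integer::sum);
        builtIndexes.remove(indexName);
    }

    private synchronized void markIndexFinished(String indexName, boolean succeeded)
    {
        // Decrement the counter, dropping the entry when it reaches zero.
        Integer remaining = pendingBuilds.computeIfPresent(indexName, (k, v) -> v == 1 ? null : v - 1);
        if (succeeded && remaining == null)
            builtIndexes.add(indexName);
    }

    synchronized boolean isBuilt(String indexName)
    {
        return builtIndexes.contains(indexName);
    }
}
```

Because the marking methods are private, an index implementation cannot call them out of order.
This sketch does not remember a failure across overlapping builds of the same index, which a
real implementation would need to handle.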

The index implementations are no longer responsible for marking the index as built. This
eliminates the bidirectional dependency between {{SecondaryIndexManager}} and the index implementations.
We might consider updating the log messages produced by {{CassandraIndex.buildBlocking}} and
{{CustomCassandraIndex.buildBlocking}}, or even moving them to the {{SecondaryIndexManager}}.

What do you think? Does it make sense?

> Node failure during 2i update after streaming can have incomplete 2i when restarted
> -----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-10130
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10130
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Coordination
>            Reporter: Yuki Morishita
>            Assignee: Andrés de la Peña
>            Priority: Minor
>
> Since MV/2i update happens after SSTables are received, node failure during MV/2i update
> can leave received SSTables live when restarted while MV/2i are partially up to date.
> We can add some kind of tracking mechanism to automatically rebuild at the startup, or
> at least warn user when the node restarts.


