cassandra-commits mailing list archives

From Andrés de la Peña (JIRA) <>
Subject [jira] [Commented] (CASSANDRA-10130) Node failure during 2i update after streaming can have incomplete 2i when restarted
Date Wed, 17 May 2017 13:19:04 GMT


Andrés de la Peña commented on CASSANDRA-10130:

bq. do we really need a {{PendingIndexBuildsCounter}} class? I thought it would be simpler
to make {{pendingBuilds}} a {{Map<String, AtomicInteger>}} and keep the inc/dec logic
on {{markIndexBuilding}} and {{markIndexBuilt}} instead, which are the only consumers of this
Probably not. The {{AtomicInteger}}s are only going to be manipulated inside a {{synchronized}}
block, so a mutable integer that is not thread-safe would be enough; that is what led me to build
the {{PendingIndexBuildsCounter}} class. But it is true that it is easier to just use a {{Map<String,
AtomicInteger>}} and keep the logic inside the {{markIndex*}} methods.
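
To make the simpler option concrete, here is a minimal sketch of the counting logic kept directly in the two marking methods. It is not the patch: the class name {{PendingBuildsSketch}}, the in-memory {{builtIndexes}} set (a stand-in for the persistent "built" marker), the {{isBuilt}} helper, and the choice to throw on an unpaired {{markIndexBuilt}} are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not the actual SecondaryIndexManager code.
class PendingBuildsSketch {
    // In-flight builds per index name; every access is guarded by "this".
    private final Map<String, AtomicInteger> pendingBuilds = new HashMap<>();
    // Assumed stand-in for the persistent "index is built" marker.
    private final Set<String> builtIndexes = new HashSet<>();

    synchronized void markIndexBuilding(String indexName) {
        // A build is starting, so the index can no longer be considered complete.
        builtIndexes.remove(indexName);
        pendingBuilds.computeIfAbsent(indexName, k -> new AtomicInteger()).incrementAndGet();
    }

    synchronized void markIndexBuilt(String indexName) {
        AtomicInteger counter = pendingBuilds.get(indexName);
        // Strict pairing (an assumption of this sketch): reject unpaired calls.
        if (counter == null)
            throw new IllegalStateException("No build in progress for " + indexName);
        // Only the last build to finish marks the index as built.
        if (counter.decrementAndGet() == 0) {
            pendingBuilds.remove(indexName);
            builtIndexes.add(indexName);
        }
    }

    synchronized boolean isBuilt(String indexName) {
        return builtIndexes.contains(indexName);
    }
}
```

With two overlapping builds of the same index, the first {{markIndexBuilt}} only decrements the counter; the index is reported as built after the second one.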

bq. I'm kinda unsure if we should actually call {{markIndexBuilding}} inside {{createIndex}}:
what if the user forgets to call {{markIndexBuilt}} inside the initialization task? There's
no such contract forcing the user to do that. So, as a corollary, we should probably accept
calls to {{markIndexBuilt}} even without a previous {{markIndexBuilding}} call (that is, making
it idempotent).
It is true that an implementation could forget to call {{markIndexBuilding}} or {{markIndexBuilt}}.
But I think that if we relax the counter system we could reintroduce the race conditions that
we are trying to solve. I mean, an effective call to {{markIndexBuilt}} (one that actually marks
the index) without a preceding {{markIndexBuilding}} could spoil the efforts of a properly paired
{{markIndexBuilding}}/{{markIndexBuilt}} usage. The approach suggested by [~pauloricardomg] could
help. Also, as pointed out, there is no contract forcing users to call any of the {{markIndex*}}
methods, so it's hard to anticipate all possible scenarios and avoid race conditions.
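
The concern about relaxing the counters can be illustrated with a small, deliberately naive sketch. Everything in it is hypothetical ({{RelaxedMarkerSketch}}, the in-memory {{builtIndexes}} set, the {{isBuilt}} helper); it only models a {{markIndexBuilt}} that tolerates calls without a preceding {{markIndexBuilding}} by never letting the counter drop below zero.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a relaxed marker; it exists only to show the pitfall.
class RelaxedMarkerSketch {
    private final Map<String, Integer> pendingBuilds = new HashMap<>();
    private final Set<String> builtIndexes = new HashSet<>();

    synchronized void markIndexBuilding(String indexName) {
        builtIndexes.remove(indexName);
        pendingBuilds.merge(indexName, 1, Integer::sum);
    }

    // Relaxed variant: accepts calls that have no matching markIndexBuilding.
    synchronized void markIndexBuilt(String indexName) {
        int remaining = Math.max(0, pendingBuilds.getOrDefault(indexName, 0) - 1);
        if (remaining == 0) {
            pendingBuilds.remove(indexName);
            builtIndexes.add(indexName);
        } else {
            pendingBuilds.put(indexName, remaining);
        }
    }

    synchronized boolean isBuilt(String indexName) {
        return builtIndexes.contains(indexName);
    }

    public static void main(String[] args) {
        RelaxedMarkerSketch marker = new RelaxedMarkerSketch();
        marker.markIndexBuilding("idx"); // build A starts
        marker.markIndexBuilt("idx");    // stray, unpaired call from another component
        // Build A is still running, yet the index is already reported as built.
        System.out.println(marker.isBuilt("idx")); // prints "true"
    }
}
```

If build A then failed, or the node went down before it finished, the index would remain marked as built while being incomplete, which is the situation the counters are meant to prevent.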

As an alternative approach, we could hide all the {{markIndex*}} methods and let the {{SecondaryIndexManager}}
manage them internally. This should avoid the risk of index implementations or other components
making calls out of order. Here is a patch showing the approach:


It adds a {{Runnable}} argument to {{buildAllIndexesBlocking}} so that indexing of new sstables
can be retried after a failure, guaranteeing that the hidden {{markIndex*}} methods are properly

The index implementations are no longer responsible for marking the index as built. This
eliminates the bidirectional dependency between {{SecondaryIndexManager}} and the index implementations.
We might consider updating the log messages produced by {{CassandraIndex.buildBlocking}} and
{{CustomCassandraIndex.buildBlocking}}, or even moving them to the {{SecondaryIndexManager}}.
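
As a rough illustration of the manager-owned marking, the sketch below keeps the marking methods private and always brackets the build task with them, so callers cannot invoke them out of order. It does not reproduce the patch: the class {{ManagedBuildSketch}}, the {{buildBlocking}} signature, the {{isFullRebuild}} flag, and the policy that only a full rebuild clears a previous failure are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: names and signatures are illustrative, not the patch's API.
class ManagedBuildSketch {
    private final Map<String, Integer> pendingBuilds = new HashMap<>();
    private final Set<String> builtIndexes = new HashSet<>();
    // Indexes with a failed build; only a full rebuild clears this (assumed policy).
    private final Set<String> needsFullRebuild = new HashSet<>();

    // The only entry point: the marking always brackets the build task.
    void buildBlocking(String indexName, boolean isFullRebuild, Runnable buildTask) {
        markIndexBuilding(indexName, isFullRebuild);
        boolean succeeded = false;
        try {
            buildTask.run();
            succeeded = true;
        } finally {
            // Runs on success and on failure, so the counter is always balanced.
            markIndexFinished(indexName, succeeded);
        }
    }

    synchronized boolean isBuilt(String indexName) {
        return builtIndexes.contains(indexName);
    }

    private synchronized void markIndexBuilding(String indexName, boolean isFullRebuild) {
        builtIndexes.remove(indexName);
        if (isFullRebuild)
            needsFullRebuild.remove(indexName);
        pendingBuilds.merge(indexName, 1, Integer::sum);
    }

    private synchronized void markIndexFinished(String indexName, boolean succeeded) {
        if (!succeeded)
            needsFullRebuild.add(indexName);
        int remaining = pendingBuilds.merge(indexName, -1, Integer::sum);
        if (remaining == 0) {
            pendingBuilds.remove(indexName);
            // Mark as built only if no build has failed since the last full rebuild.
            if (!needsFullRebuild.contains(indexName))
                builtIndexes.add(indexName);
        }
    }
}
```

In this shape an index implementation only supplies the work to run; whether the index ends up marked as built is decided entirely inside the manager.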

What do you think? Does it make sense?

> Node failure during 2i update after streaming can have incomplete 2i when restarted
> -----------------------------------------------------------------------------------
>                 Key: CASSANDRA-10130
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Coordination
>            Reporter: Yuki Morishita
>            Assignee: Andrés de la Peña
>            Priority: Minor
> Since MV/2i update happens after SSTables are received, node failure during MV/2i update can leave received SSTables live when restarted while MV/2i are partially up to date.
> We can add some kind of tracking mechanism to automatically rebuild at the startup, or at least warn user when the node restarts.
