cassandra-commits mailing list archives

From "mck (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-8798) don't throw TombstoneOverwhelmingException during bootstrap
Date Fri, 13 Nov 2015 23:29:11 GMT


mck commented on CASSANDRA-8798:

Looking through [~jjirsa]'s patch, what strikes me is that we're getting tangled up through the
code, i.e. there's little IoC for what really is just the "settings" C* is currently running with.

Might C* benefit from a delegation model over DatabaseDescriptor/Config?

That way, during bootstrap, DatabaseDescriptor can delegate to a Config where settings like
tombstone_failure_threshold are overridden to Integer.MAX_VALUE.

That way the code involves DatabaseDescriptor more, along with some notion of the current state,
e.g. "bootstrapping", kind of what {{respectTombstoneThresholds()}} attempts to do but at a higher
level. I'm thinking out loud here and haven't really validated my thoughts against the code
(for example, it requires that classes don't copy DatabaseDescriptor fields), but I wonder
if there are other states C* finds itself in that warrant similar temporary alterations to
the Config settings.
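
To make the delegation idea concrete, here is a minimal sketch. It is not taken from the patch or from the Cassandra code base: {{Settings}} and {{Descriptor}} are hypothetical stand-ins for Config and DatabaseDescriptor, {{whileBootstrapping}} is an invented helper, and 100000 is used only as an illustrative base threshold.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for Config: holds only the one setting discussed here.
class Settings {
    final int tombstoneFailureThreshold;

    Settings(int tombstoneFailureThreshold) {
        this.tombstoneFailureThreshold = tombstoneFailureThreshold;
    }
}

// Hypothetical stand-in for DatabaseDescriptor: every read goes through
// whichever Settings instance is currently active, so callers never need
// to know which state the node is in.
final class Descriptor {
    private static final Settings BASE = new Settings(100_000); // illustrative value
    private static volatile Settings active = BASE;

    private Descriptor() {}

    static int getTombstoneFailureThreshold() {
        return active.tombstoneFailureThreshold;
    }

    // Assumed helper: swap in an override for the duration of the work,
    // then restore the previous Settings even if the work throws.
    static <T> T whileBootstrapping(Supplier<T> work) {
        Settings previous = active;
        active = new Settings(Integer.MAX_VALUE);
        try {
            return work.get();
        } finally {
            active = previous;
        }
    }
}
```

The sketch only works if callers read the threshold through {{Descriptor}} each time rather than copying it into a field, which is the caveat raised above. The swap is also process-wide, so a real version would have to decide what concurrent readers should see while it is in effect.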

> don't throw TombstoneOverwhelmingException during bootstrap
> -----------------------------------------------------------
>                 Key: CASSANDRA-8798
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: mck
>         Attachments: 8798.txt
> During bootstrap, honouring tombstone_failure_threshold seems counter-productive, as the node is not serving requests and so is not protecting anything.
> Instead, what happens is that bootstrap fails, and a cluster that obviously needs an extra node isn't getting it...
> **History**
> When adding a new node, the bootstrap process looks complete in that streaming is finished, compactions are finished, and all disk and CPU activity is calm.
> But the node is still stuck in "joining" status.
> The last stage in the bootstrapping process is the rebuilding of secondary indexes. Grepping the logs confirmed it failed during this stage.
> {code}grep SecondaryIndexManager cassandra/logs/*{code}
> To see which secondary index rebuilds were initiated:
> {code}
> grep "index build of " cassandra/logs/* | awk -F" for data in " '{print $1}'
> INFO 13:18:11,252 Submitting index build of addresses.unobfuscatedIndex
> INFO 13:18:11,352 Submitting index build of Inbox.FINNBOXID_INDEX
> INFO 23:03:54,758 Submitting index build of [events.collected_tbIndex, events.real_tbIndex]
> {code}
> To get an idea of which secondary index rebuilds succeeded:
> {code}grep "Index build of " cassandra/logs/*
> INFO 13:18:11,263 Index build of addresses.unobfuscatedIndex complete
> INFO 13:18:11,355 Index build of Inbox.FINNBOXID_INDEX complete
> {code}
> Looking closer at {{[events.collected_tbIndex, events.real_tbIndex]}} showed the following:
> {code}
> ERROR [StreamReceiveTask:121] 2015-02-12 05:54:47,768 (line 199) Exception in thread Thread[StreamReceiveTask:121,5,main]
> java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException:
>         at org.apache.cassandra.utils.FBUtilities.waitOnFuture(
>         at org.apache.cassandra.db.index.SecondaryIndexManager.maybeBuildSecondaryIndexes(
>         at org.apache.cassandra.streaming.StreamReceiveTask$
>         at java.util.concurrent.Executors$
>         at
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(
>         at java.util.concurrent.ThreadPoolExecutor$
>         at
> Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.cassandra.db.filter.TombstoneOverwhelmingException
>         at
>         at java.util.concurrent.FutureTask.get(
>         at org.apache.cassandra.utils.FBUtilities.waitOnFuture(
>         ... 7 more
> Caused by: java.lang.RuntimeException: org.apache.cassandra.db.filter.TombstoneOverwhelmingException
>         at org.apache.cassandra.service.pager.QueryPagers$
>         at org.apache.cassandra.service.pager.QueryPagers$
>         at org.apache.cassandra.db.Keyspace.indexRow(
>         at
>         at org.apache.cassandra.db.compaction.CompactionManager$
>         ... 5 more
> Caused by: org.apache.cassandra.db.filter.TombstoneOverwhelmingException
>         at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(
>         at org.apache.cassandra.db.filter.QueryFilter.collateColumns(
>         at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(
>         at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(
>         at org.apache.cassandra.db.CollationController.collectAllData(
>         at org.apache.cassandra.db.CollationController.getTopLevelColumns(
>         at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(
>         at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(
>         at org.apache.cassandra.db.Keyspace.getRow(
>         at org.apache.cassandra.db.SliceFromReadCommand.getRow(
>         at org.apache.cassandra.service.pager.SliceQueryPager.queryNextPage(
>         at org.apache.cassandra.service.pager.AbstractQueryPager.fetchPage(
>         at org.apache.cassandra.service.pager.SliceQueryPager.fetchPage(
>         at org.apache.cassandra.service.pager.QueryPagers$
>         ... 9 more
> {code}
> To get past this I had to raise org.apache.cassandra.db:type=StorageService.TombstoneFailureThreshold and manually rebuild the index, then restart the node with auto_bootstrap=false.

This message was sent by Atlassian JIRA
