cassandra-commits mailing list archives

From "mck (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-8798) don't throw TombstoneOverwhelmingException during bootstrap
Date Fri, 13 Nov 2015 23:30:11 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15004904#comment-15004904 ]

mck edited comment on CASSANDRA-8798 at 11/13/15 11:29 PM:
-----------------------------------------------------------

Looking through [~jjirsa]'s patch, what strikes me is that we're getting tangled up throughout
the code, i.e. there's little IoC applied to what really is just the "settings" C* is currently
running against.

[~iamaleksey], might C* benefit from a delegation model over DatabaseDescriptor/Config?

That way, during bootstrap DatabaseDescriptor can delegate to a Config where a setting like
tombstone_failure_threshold is overridden to Integer.MAX_VALUE.

That way the code leans more on DatabaseDescriptor and some notion of the current state, e.g.
"bootstrapping", which is roughly what {{respectTombstoneThresholds()}} attempts to do, but at a
higher level. I'm thinking out loud here (and haven't really validated my thoughts against
the code; for example, it requires that classes don't copy DatabaseDescriptor fields), but I
wonder whether there are other states C* finds itself in that warrant similar temporary
alterations to the Config settings.
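
To make the delegation idea above concrete, here is a minimal sketch in Java. Everything in it is hypothetical and for illustration only: {{Settings}}, {{FileSettings}}, {{BootstrapSettings}}, {{Descriptor}}, {{enterBootstrap()}} and {{leaveBootstrap()}} are made-up names, not Cassandra's actual DatabaseDescriptor/Config API and not what the attached patch does. It only assumes the two settings tombstone_warn_threshold and tombstone_failure_threshold, with example values of 1000 and 100000.

```java
// Hypothetical sketch of a delegation model over settings; not Cassandra code.
final class DelegatingSettingsSketch {

    /** Stand-in for the subset of Config discussed here (assumed names). */
    interface Settings {
        int tombstoneWarnThreshold();
        int tombstoneFailureThreshold();
    }

    /** Settings as they would be loaded from configuration. */
    static final class FileSettings implements Settings {
        private final int warn;
        private final int failure;

        FileSettings(int warn, int failure) {
            this.warn = warn;
            this.failure = failure;
        }

        @Override public int tombstoneWarnThreshold() { return warn; }
        @Override public int tombstoneFailureThreshold() { return failure; }
    }

    /** Delegates everything except the failure threshold, which is lifted. */
    static final class BootstrapSettings implements Settings {
        private final Settings delegate;

        BootstrapSettings(Settings delegate) {
            this.delegate = delegate;
        }

        @Override public int tombstoneWarnThreshold() { return delegate.tombstoneWarnThreshold(); }
        @Override public int tombstoneFailureThreshold() { return Integer.MAX_VALUE; }
    }

    /**
     * Stand-in for DatabaseDescriptor. Callers must ask it on every use
     * rather than copying values, otherwise the state switch has no effect.
     */
    static final class Descriptor {
        private final Settings base;
        private volatile Settings active;

        Descriptor(Settings base) {
            this.base = base;
            this.active = base;
        }

        void enterBootstrap() { active = new BootstrapSettings(base); }
        void leaveBootstrap() { active = base; }

        int getTombstoneWarnThreshold() { return active.tombstoneWarnThreshold(); }
        int getTombstoneFailureThreshold() { return active.tombstoneFailureThreshold(); }
    }

    public static void main(String[] args) {
        Descriptor descriptor = new Descriptor(new FileSettings(1000, 100000));
        descriptor.enterBootstrap();
        // While "bootstrapping", the failure threshold is effectively disabled.
        System.out.println(descriptor.getTombstoneFailureThreshold() == Integer.MAX_VALUE);
        descriptor.leaveBootstrap();
        // Afterwards the configured value applies again.
        System.out.println(descriptor.getTombstoneFailureThreshold());
    }
}
```

The sketch also shows the caveat mentioned above: the override only takes effect for code that reads through the descriptor each time, so any class holding its own copy of a field would keep seeing the old value.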


was (Author: michaelsembwever):
Looking through [~jjirsa]'s patch what strikes me if we're getting tangled up through the
code, ie there's little IoC. to what really is just the "settings" c* is currently running
against.

[~iamaleksey], might C* benefit from a delegation model over DatabaseDescriptor/Config?

That way during bootstrap DatabaseDescriptor can delegate to a Config where settings like
tombstone_failure_threshold is overridden to Integer.MAX_VALUE.

That way the code involves more DatabaseDescriptor and some notion of the current state, eg
"bootstrapping", kinda what {{respectTombstoneThresholds()}} attempts to do but at a higher
level. While i'm thinking out loud here (and haven't really validated my thoughts through
the code, for example it requires that classes don't copy DatabaseDescriptor fields) I wonder
if there are other states C* finds itself in that warrant similar temporary alterations to
the Config settings.

> don't throw TombstoneOverwhelmingException during bootstrap
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-8798
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8798
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: mck
>         Attachments: 8798.txt
>
>
> During bootstrap, honouring tombstone_failure_threshold seems counter-productive, as the node is not serving requests and so the threshold is not protecting anything.
> Instead, what happens is that bootstrap fails, and a cluster that obviously needs an extra node isn't getting it...
> **History**
> When adding a new node, the bootstrap process looks complete in that streaming is finished, compactions are finished, and all disk and CPU activity is calm.
> But the node is still stuck in "joining" status.
> The last stage in the bootstrapping process is the rebuilding of secondary indexes. Grepping the logs confirmed it failed during this stage.
> {code}grep SecondaryIndexManager cassandra/logs/*{code}
> To see what secondary index rebuilding was initiated
> {code}
> grep "index build of " cassandra/logs/* | awk -F" for data in " '{print $1}'
> INFO 13:18:11,252 Submitting index build of addresses.unobfuscatedIndex
> INFO 13:18:11,352 Submitting index build of Inbox.FINNBOXID_INDEX
> INFO 23:03:54,758 Submitting index build of [events.collected_tbIndex, events.real_tbIndex]
> {code}
> To get an idea of successful secondary index rebuilding 
> {code}grep "Index build of " cassandra/logs/*
> INFO 13:18:11,263 Index build of addresses.unobfuscatedIndex complete
> INFO 13:18:11,355 Index build of Inbox.FINNBOXID_INDEX complete
> {code}
> Looking closer at {{[events.collected_tbIndex, events.real_tbIndex]}} showed the following stacktrace
> {code}
> ERROR [StreamReceiveTask:121] 2015-02-12 05:54:47,768 CassandraDaemon.java (line 199) Exception in thread Thread[StreamReceiveTask:121,5,main]
> java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.cassandra.db.filter.TombstoneOverwhelmingException
>         at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:413)
>         at org.apache.cassandra.db.index.SecondaryIndexManager.maybeBuildSecondaryIndexes(SecondaryIndexManager.java:142)
>         at org.apache.cassandra.streaming.StreamReceiveTask$OnCompletionRunnable.run(StreamReceiveTask.java:130)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.cassandra.db.filter.TombstoneOverwhelmingException
>         at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:188)
>         at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:409)
>         ... 7 more
> Caused by: java.lang.RuntimeException: org.apache.cassandra.db.filter.TombstoneOverwhelmingException
>         at org.apache.cassandra.service.pager.QueryPagers$1.next(QueryPagers.java:160)
>         at org.apache.cassandra.service.pager.QueryPagers$1.next(QueryPagers.java:143)
>         at org.apache.cassandra.db.Keyspace.indexRow(Keyspace.java:406)
>         at org.apache.cassandra.db.index.SecondaryIndexBuilder.build(SecondaryIndexBuilder.java:62)
>         at org.apache.cassandra.db.compaction.CompactionManager$9.run(CompactionManager.java:834)
>         ... 5 more
> Caused by: org.apache.cassandra.db.filter.TombstoneOverwhelmingException
>         at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:202)
>         at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122)
>         at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
>         at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
>         at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
>         at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
>         at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1547)
>         at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1376)
>         at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:333)
>         at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65)
>         at org.apache.cassandra.service.pager.SliceQueryPager.queryNextPage(SliceQueryPager.java:85)
>         at org.apache.cassandra.service.pager.AbstractQueryPager.fetchPage(AbstractQueryPager.java:88)
>         at org.apache.cassandra.service.pager.SliceQueryPager.fetchPage(SliceQueryPager.java:35)
>         at org.apache.cassandra.service.pager.QueryPagers$1.next(QueryPagers.java:154)
>         ... 9 more
> {code}
> To get past this I had to raise org.apache.cassandra.db:type=StorageService.TombstoneFailureThreshold and manually rebuild the index, then restart the node with auto_bootstrap=false.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
