cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-6998) HintedHandoff - expired hints may block future hints deliveries
Date Mon, 12 May 2014 19:49:16 GMT


Jonathan Ellis commented on CASSANDRA-6998:

I see.

That would fix the problem, but I'm not really a fan of adding extra layers of complexity
to mask symptoms of a deeper underlying problem (CASSANDRA-6666).  Let's first figure out
*why* these tombstones aren't going away the way we expect them to; it's likely that we can
then solve the problem without breaking the SQF contract (and then trying to make up for that
at a higher level).

> HintedHandoff - expired hints may block future hints deliveries
> ---------------------------------------------------------------
>                 Key: CASSANDRA-6998
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: - cluster of two DCs: DC1, DC2
> - keyspace using NetworkTopologyStrategy (replication factors for both DCs)
> - heavy load (write:read, 100:1) with LOCAL_QUORUM, using the Java driver configured
> with DC awareness, writing to DC1
>            Reporter: Scooletz
>              Labels: HintedHandoff, TTL
>         Attachments: 6998
> For testing purposes, DC2 was shut down for 1 day. The _hints_ table filled up with millions
> of rows. Now, when _HintedHandOffManager_ runs _doDeliverHintsToEndpoint_, it queries
> the store with QueryFilter.getSliceFilter, which counts deleted (TTLed) cells and throws
> org.apache.cassandra.db.filter.TombstoneOverwhelmingException.

> Throwing this exception stops the manager from running compaction, because compaction runs
> only after a successful handoff. This leaves HH practically disabled until an administrator
> runs truncateAllHints.

> Wouldn't it be nicer to run compaction when org.apache.cassandra.db.filter.TombstoneOverwhelmingException
> is thrown? That would remove the TTLed hints, leaving the whole HH mechanism in a healthy state.
> The stacktrace is:
> {quote}
> org.apache.cassandra.db.filter.TombstoneOverwhelmingException
> 	at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(
> 	at org.apache.cassandra.db.filter.QueryFilter.collateColumns(
> 	at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(
> 	at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(
> 	at org.apache.cassandra.db.CollationController.collectAllData(
> 	at org.apache.cassandra.db.CollationController.getTopLevelColumns(
> 	at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(
> 	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(
> 	at org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(
> 	at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(
> 	at org.apache.cassandra.db.HintedHandOffManager.access$300(
> 	at org.apache.cassandra.db.HintedHandOffManager$
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(
> 	at java.util.concurrent.ThreadPoolExecutor$
> 	at
> {quote}
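
The failure mode described in the report, and the reporter's suggested workaround, can be
illustrated with a toy model. This is only a sketch under loud assumptions: HintStore,
TooManyTombstonesException, deliver and the compactOnFailure flag are hypothetical names
invented for illustration, not Cassandra classes or APIs, and the real delivery path is
considerably more involved.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types exist in Cassandra.
public class HintDeliverySketch {

    /** Hypothetical stand-in for TombstoneOverwhelmingException. */
    static class TooManyTombstonesException extends RuntimeException {}

    /** Hypothetical in-memory hints store; a null entry models an expired (TTLed) hint. */
    static class HintStore {
        private final List<String> cells = new ArrayList<>();
        private final int tombstoneLimit;

        HintStore(int tombstoneLimit) { this.tombstoneLimit = tombstoneLimit; }

        void addLive(String hint) { cells.add(hint); }

        void addExpired() { cells.add(null); }

        /** Scans all cells; fails once more than tombstoneLimit expired cells are seen. */
        List<String> readLive() {
            List<String> live = new ArrayList<>();
            int tombstones = 0;
            for (String cell : cells) {
                if (cell == null) {
                    if (++tombstones > tombstoneLimit) {
                        throw new TooManyTombstonesException();
                    }
                } else {
                    live.add(cell);
                }
            }
            return live;
        }

        /** Models compaction purging expired hints. */
        void compact() { cells.removeIf(cell -> cell == null); }

        /** Drops hints that were handed off successfully. */
        void removeDelivered(List<String> delivered) { cells.removeAll(delivered); }
    }

    /**
     * Returns the number of hints delivered in this attempt. Compaction normally
     * runs only after a successful read, mirroring the behaviour in the report.
     */
    static int deliver(HintStore store, boolean compactOnFailure) {
        try {
            List<String> live = store.readLive();
            store.removeDelivered(live);
            store.compact();
            return live.size();
        } catch (TooManyTombstonesException e) {
            if (compactOnFailure) {
                store.compact(); // the reporter's proposed workaround
            }
            return 0;
        }
    }

    static HintStore storeWithExpiredHints() {
        HintStore store = new HintStore(2);
        for (int i = 0; i < 3; i++) {
            store.addExpired();
        }
        store.addLive("hint-for-DC2");
        return store;
    }

    public static void main(String[] args) {
        HintStore stuck = storeWithExpiredHints();
        System.out.println("no workaround, attempt 1: " + deliver(stuck, false));
        System.out.println("no workaround, attempt 2: " + deliver(stuck, false));

        HintStore recovering = storeWithExpiredHints();
        System.out.println("workaround, attempt 1: " + deliver(recovering, true));
        System.out.println("workaround, attempt 2: " + deliver(recovering, true));
    }
}
```

Without the flag, every attempt hits the same expired cells and nothing is ever purged. With
it, the first attempt still fails but the next one succeeds. Note that this models only the
reporter's suggestion; the comment above prefers first finding out why the tombstones are not
going away (CASSANDRA-6666) rather than compensating at a higher level.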

This message was sent by Atlassian JIRA
