cassandra-commits mailing list archives

From "Muhammad Adel (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6998) HintedHandoff - expired hints may block future hints deliveries
Date Mon, 14 Apr 2014 15:54:17 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968457#comment-13968457
] 

Muhammad Adel commented on CASSANDRA-6998:
------------------------------------------

Deleting undelivered tombstones can introduce consistency problems. I don't think a hinted
handoff row should ever be deleted before delivery, even if it is a tombstone.

The exception is thrown when the number of deleted columns in the query result exceeds
tombstone_failure_threshold.
This can be solved in two ways:
1- Instead of just removing the tombstones by compacting the table, we can decrease the page
size when catching the exception and then try to perform the hinted handoff again. This will
decrease the number of deleted columns below tombstone_failure_threshold.
2- A better design choice, in my opinion, is to use another type of query, one that returns
rows up to a given page size regardless of whether they are live or not. Slice queries are not
a suitable type for this task, since hinted handoff delivery doesn't care about the status of a given row.
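
To make option 1 concrete, here is a rough sketch of the retry-with-smaller-page idea on a toy model. None of this is Cassandra code: HintPageBackoff, TooManyTombstones, readPage and readWithBackoff are hypothetical names, and the boolean array merely stands in for a hints row (true = live hint, false = expired hint). It assumes a read keeps scanning until it has collected a full page of live cells and fails once it has passed more dead cells than the threshold allows.

```java
import java.util.ArrayList;
import java.util.List;

final class HintPageBackoff {
    /** Hypothetical stand-in for TombstoneOverwhelmingException. */
    static class TooManyTombstones extends RuntimeException {}

    /**
     * Toy slice read: scan from `start` until `pageSize` live cells are
     * collected, failing once more than `threshold` dead cells were seen.
     * Returns the indexes of the live cells that were collected.
     */
    static List<Integer> readPage(boolean[] cells, int start, int pageSize, int threshold) {
        List<Integer> live = new ArrayList<>();
        int dead = 0;
        for (int i = start; i < cells.length && live.size() < pageSize; i++) {
            if (cells[i]) {
                live.add(i);
            } else if (++dead > threshold) {
                throw new TooManyTombstones();
            }
        }
        return live;
    }

    /** Option 1: halve the page size and retry whenever the threshold is hit. */
    static List<Integer> readWithBackoff(boolean[] cells, int start, int pageSize, int threshold) {
        int size = pageSize;
        while (true) {
            try {
                return readPage(cells, start, size, threshold);
            } catch (TooManyTombstones e) {
                if (size == 1) {
                    throw e; // a smaller page is not possible, so give up
                }
                size = Math.max(1, size / 2);
            }
        }
    }

    public static void main(String[] args) {
        // Alternating pattern: live, expired, live, expired, ...
        boolean[] cells = new boolean[16];
        for (int i = 0; i < cells.length; i += 2) {
            cells[i] = true;
        }
        // Page sizes 8 and 4 hit the threshold of 2; page size 2 succeeds.
        System.out.println(readWithBackoff(cells, 0, 8, 2)); // prints [0, 2]
    }
}
```

In this model the backoff only helps when expired hints are interleaved with live ones. If a long run of expired hints sits at the start of the row, even a page size of 1 hits the threshold, which is one more reason to prefer option 2.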

> HintedHandoff - expired hints may block future hints deliveries
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-6998
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6998
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: - cluster of two DCs: DC1, DC2
> - keyspace using NetworkTopologyStrategy (replication factors for both DCs)
> - heavy load (write:read, 100:1) with LOCAL_QUORUM using Java driver setup with DC awareness,
writing to DC1
>            Reporter: Scooletz
>              Labels: HintedHandoff, TTL
>             Fix For: 2.0.3
>
>
> For testing purposes, DC2 was shut down for 1 day. The _hints_ table filled up with millions
> of rows. Now, when _HintedHandOffManager_ tries to _doDeliverHintsToEndpoint_, it queries
> the store with QueryFilter.getSliceFilter, which counts deleted (TTLed) cells and throws
> org.apache.cassandra.db.filter.TombstoneOverwhelmingException.
>
> Throwing this exception stops the manager from running compaction, as compaction runs only
> after a successful handoff. This leaves HH practically disabled until an administrator runs
> truncateAllHints.
>
> Wouldn't it be nicer to run compaction on org.apache.cassandra.db.filter.TombstoneOverwhelmingException?
> That would remove TTLed hints, leaving the whole HH mechanism in a healthy state.
> The stacktrace is:
> {quote}
> org.apache.cassandra.db.filter.TombstoneOverwhelmingException
> 	at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:201)
> 	at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122)
> 	at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
> 	at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
> 	at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
> 	at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
> 	at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1487)
> 	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1306)
> 	at org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:351)
> 	at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:309)
> 	at org.apache.cassandra.db.HintedHandOffManager.access$300(HintedHandOffManager.java:92)
> 	at org.apache.cassandra.db.HintedHandOffManager$4.run(HintedHandOffManager.java:530)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> 	at java.lang.Thread.run(Thread.java:722)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.2#6252)