cassandra-user mailing list archives

From Sylvain Lebresne <sylv...@datastax.com>
Subject Re: Hinted handoff bug?
Date Thu, 01 Dec 2011 10:10:14 GMT
You're right, good catch.
Do you mind opening a ticket on jira
(https://issues.apache.org/jira/browse/CASSANDRA)?

--
Sylvain

On Thu, Dec 1, 2011 at 10:03 AM, Fredrik L Stigbäck
<fredrik.l.stigback@sitevision.se> wrote:
> Hi,
> We're running Cassandra 1.0.3.
> I've done some testing with 2 nodes (node A, node B), replication factor 2.
> I take node A down, write some data to node B, and then bring node A back up.
> Sometimes hints aren't delivered when node A comes up.
>
> I've done some debugging in org.apache.cassandra.db.HintedHandOffManager and
> sometimes node B ends up in a strange state in method
> org.apache.cassandra.db.HintedHandOffManager.deliverHints(final InetAddress
> to), where org.apache.cassandra.db.HintedHandOffManager.queuedDeliveries
> already has node A in its Set and therefore no hints will ever be delivered
> to node A.
> The only reason for this that I can see is that in
> org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(InetAddress
> endpoint) the hintStore.isEmpty() check returns true and the endpoint (node
> A) isn't removed from
> org.apache.cassandra.db.HintedHandOffManager.queuedDeliveries. Then no hints
> will ever be delivered again until node B is restarted.
> Under what conditions will hintStore.isEmpty() return true?
> Shouldn't the hintStore.isEmpty() check be inside the try {} finally{}
> clause, removing the endpoint from queuedDeliveries in the finally block?
>
> public void deliverHints(final InetAddress to)
> {
>         logger_.debug("deliverHints to {}", to);
>         if (!queuedDeliveries.add(to))
>             return;
>         .......
> }
>
> private void deliverHintsToEndpoint(InetAddress endpoint) throws
> IOException, DigestMismatchException, InvalidRequestException,
> TimeoutException
> {
>         ColumnFamilyStore hintStore = Table.open(Table.SYSTEM_TABLE).getColumnFamilyStore(HINTS_CF);
>         if (hintStore.isEmpty())
>             return; // nothing to do, don't confuse users by logging a no-op handoff
>         try
>         {
>             ......
>         }
>         finally
>         {
>             queuedDeliveries.remove(endpoint);
>         }
> }
>
> Regards
> /Fredrik
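
The change proposed in the quoted message (moving the isEmpty() check inside the try block so the finally clause always runs) can be sketched roughly as below. This is a minimal stand-alone illustration, not the actual Cassandra code: the class HintDeliverySketch, the hintStoreEmpty flag, and the isQueued() helper are hypothetical stand-ins, and delivery runs synchronously here, which is a simplifying assumption.

```java
import java.net.InetAddress;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for HintedHandOffManager; only the queuing guard is modeled.
final class HintDeliverySketch {
    private final Set<InetAddress> queuedDeliveries = ConcurrentHashMap.newKeySet();

    // Assumption: a plain flag replaces hintStore.isEmpty() for illustration.
    private volatile boolean hintStoreEmpty;

    void setHintStoreEmpty(boolean empty) {
        hintStoreEmpty = empty;
    }

    // Returns false when a delivery to this endpoint is already queued.
    boolean deliverHints(InetAddress to) {
        if (!queuedDeliveries.add(to)) {
            return false;
        }
        deliverHintsToEndpoint(to); // run synchronously in this sketch
        return true;
    }

    private void deliverHintsToEndpoint(InetAddress endpoint) {
        try {
            if (hintStoreEmpty) {
                return; // nothing to do; finally still clears the queue entry
            }
            // ... actual hint delivery elided ...
        } finally {
            queuedDeliveries.remove(endpoint);
        }
    }

    // Hypothetical helper so the state can be inspected.
    boolean isQueued(InetAddress endpoint) {
        return queuedDeliveries.contains(endpoint);
    }
}
```

With the check inside the try block, an early return on an empty hint store no longer leaves the endpoint in queuedDeliveries, so a later deliverHints() call for the same node is accepted.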
