incubator-cassandra-user mailing list archives

From Fredrik L Stigbäck <fredrik.l.stigb...@sitevision.se>
Subject Re: Hinted handoff bug?
Date Thu, 01 Dec 2011 10:22:15 GMT
Yes, I'll do that.

/Fredrik
Sylvain Lebresne skrev 2011-12-01 11:10:
> You're right, good catch.
> Do you mind opening a ticket on jira
> (https://issues.apache.org/jira/browse/CASSANDRA)?
>
> --
> Sylvain
>
> On Thu, Dec 1, 2011 at 10:03 AM, Fredrik L Stigbäck
> <fredrik.l.stigback@sitevision.se>  wrote:
>> Hi,
>> We're running Cassandra 1.0.3.
>> I've done some testing with 2 nodes (node A, node B), replication factor 2.
>> I take node A down, write some data to node B, and then bring node A back up.
>> Sometimes hints aren't delivered when node A comes up.
>>
>> I've done some debugging in org.apache.cassandra.db.HintedHandOffManager and
>> sometimes node B ends up in a strange state in method
>> org.apache.cassandra.db.HintedHandOffManager.deliverHints(final InetAddress
>> to), where org.apache.cassandra.db.HintedHandOffManager.queuedDeliveries
>> already has node A in its Set and therefore no hints will ever be delivered
>> to node A.
>> The only reason for this that I can see is that in
>> org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(InetAddress
>> endpoint) the hintStore.isEmpty() check returns true and the endpoint (node
>> A)  isn't removed from
>> org.apache.cassandra.db.HintedHandOffManager.queuedDeliveries. Then no hints
>> will ever be delivered again until node B is restarted.
>> Under what conditions will hintStore.isEmpty() return true?
>> Shouldn't the hintStore.isEmpty() check be inside the try {} finally{}
>> clause, removing the endpoint from queuedDeliveries in the finally block?
>>
>> public void deliverHints(final InetAddress to)
>> {
>>          logger_.debug("deliverHints to {}", to);
>>          if (!queuedDeliveries.add(to))
>>              return;
>>          .......
>> }
>>
>> private void deliverHintsToEndpoint(InetAddress endpoint) throws
>> IOException, DigestMismatchException, InvalidRequestException,
>> TimeoutException
>> {
>>      ColumnFamilyStore hintStore =
>> Table.open(Table.SYSTEM_TABLE).getColumnFamilyStore(HINTS_CF);
>>      if (hintStore.isEmpty())
>>          return; // nothing to do, don't confuse users by logging a no-op handoff
>>      try
>>      {
>>          ......
>>      }
>>      finally
>>      {
>>          queuedDeliveries.remove(endpoint);
>>      }
>> }
>>
>> Regards
>> /Fredrik
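
The early-return problem described in the quoted message can be reproduced in isolation. The following is a minimal sketch, not the Cassandra code: HintStore, enqueue, deliverEarlyReturn and deliverCheckInsideTry are hypothetical stand-ins, endpoints are plain strings instead of InetAddress, and the actual hint delivery is left out. It only shows how an endpoint stays in queuedDeliveries when the isEmpty() check sits outside the try/finally, and how moving the check inside lets the finally block always remove it.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class HintDeliverySketch {
    // Hypothetical stand-in for the hints ColumnFamilyStore.
    interface HintStore {
        boolean isEmpty();
    }

    // Same role as queuedDeliveries in the quoted code; endpoints are strings here.
    static final Set<String> queuedDeliveries = ConcurrentHashMap.newKeySet();

    // Mirrors the guard in the quoted deliverHints: returns false if the
    // endpoint is already queued, in which case no delivery is scheduled.
    static boolean enqueue(String endpoint) {
        return queuedDeliveries.add(endpoint);
    }

    // Shape of the quoted deliverHintsToEndpoint: the early return happens
    // before the try block, so the finally block never runs on that path.
    static void deliverEarlyReturn(String endpoint, HintStore store) {
        if (store.isEmpty())
            return;
        try {
            // hint delivery would happen here
        } finally {
            queuedDeliveries.remove(endpoint);
        }
    }

    // Proposed shape: the check is inside the try, so the endpoint is
    // removed from queuedDeliveries on every exit path.
    static void deliverCheckInsideTry(String endpoint, HintStore store) {
        try {
            if (store.isEmpty())
                return;
            // hint delivery would happen here
        } finally {
            queuedDeliveries.remove(endpoint);
        }
    }

    public static void main(String[] args) {
        HintStore empty = () -> true;

        enqueue("nodeA");
        deliverEarlyReturn("nodeA", empty);
        System.out.println("early return, can re-queue: " + enqueue("nodeA"));

        queuedDeliveries.clear();
        enqueue("nodeA");
        deliverCheckInsideTry("nodeA", empty);
        System.out.println("check inside try, can re-queue: " + enqueue("nodeA"));
    }
}
```

In the first variant a second enqueue for the same endpoint is refused, which matches the reported symptom that hints are not delivered until the node is restarted.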

