incubator-cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: Hinted handoff bug?
Date Fri, 02 Dec 2011 05:49:05 GMT
Nope, that's a separate issue.
https://issues.apache.org/jira/browse/CASSANDRA-3554

On Thu, Dec 1, 2011 at 5:59 PM, Terje Marthinussen
<tmarthinussen@gmail.com> wrote:
> Sorry for not checking the source to see whether things have changed, but I just remembered an issue I had forgotten to file a JIRA ticket for.
>
> In the old days, nodes would periodically try to deliver queues.
>
> However, at some stage this was changed so that it only delivers when a node is marked up.
>
> However, you can definitely have a scenario where A fails to deliver to B, so it sends the hint to C instead.
>
> However, B is not really down; it just could not accept that packet at that time. C always (correctly, in this case) thinks B is up, so it never tries to deliver the hints to B.
>
> Will this change fix this, or do we need to bring back the thread that periodically tried to deliver hints regardless of node status changes?
>
> Regards,
> Terje
>
> On 1 Dec 2011, at 19:10, Sylvain Lebresne <sylvain@datastax.com> wrote:
>
>> You're right, good catch.
>> Do you mind opening a ticket on jira
>> (https://issues.apache.org/jira/browse/CASSANDRA)?
>>
>> --
>> Sylvain
>>
>> On Thu, Dec 1, 2011 at 10:03 AM, Fredrik L Stigbäck
>> <fredrik.l.stigback@sitevision.se> wrote:
>>> Hi,
>>> We're running Cassandra 1.0.3.
>>> I've done some testing with 2 nodes (node A, node B), replication factor 2.
>>> I take node A down, write some data to node B, and then bring node A back up.
>>> Sometimes hints aren't delivered when node A comes up.
>>>
>>> I've done some debugging in org.apache.cassandra.db.HintedHandOffManager and
>>> sometimes node B ends up in a strange state in method
>>> org.apache.cassandra.db.HintedHandOffManager.deliverHints(final InetAddress
>>> to), where org.apache.cassandra.db.HintedHandOffManager.queuedDeliveries
>>> already has node A in its Set and therefore no hints will ever be delivered
>>> to node A.
>>> The only reason for this that I can see is that in
>>> org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(InetAddress
>>> endpoint) the hintStore.isEmpty() check returns true and the endpoint (node
>>> A) isn't removed from
>>> org.apache.cassandra.db.HintedHandOffManager.queuedDeliveries. Then no hints
>>> will ever be delivered again until node B is restarted.
>>> Under what conditions will hintStore.isEmpty() return true?
>>> Shouldn't the hintStore.isEmpty() check be inside the try {} finally{}
>>> clause, removing the endpoint from queuedDeliveries in the finally block?
>>>
>>> public void deliverHints(final InetAddress to)
>>> {
>>>         logger_.debug("deliverHints to {}", to);
>>>         if (!queuedDeliveries.add(to))
>>>             return;
>>>         .......
>>> }
>>>
>>> private void deliverHintsToEndpoint(InetAddress endpoint) throws
>>> IOException, DigestMismatchException, InvalidRequestException,
>>> TimeoutException,
>>> {
>>>     ColumnFamilyStore hintStore =
>>>         Table.open(Table.SYSTEM_TABLE).getColumnFamilyStore(HINTS_CF);
>>>     if (hintStore.isEmpty())
>>>         return; // nothing to do, don't confuse users by logging a no-op handoff
>>>     try
>>>     {
>>>         ......
>>>     }
>>>     finally
>>>     {
>>>         queuedDeliveries.remove(endpoint);
>>>     }
>>> }
>>>
>>> Regards
>>> /Fredrik
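
The behaviour Fredrik describes can be reproduced in a small, self-contained model. This is only a sketch, not Cassandra code: HintDeliverySketch, setHintStoreEmpty and the guardInsideTry flag are hypothetical names, a boolean stands in for hintStore.isEmpty(), and delivery runs synchronously instead of on an executor. It suggests that an early return placed before the try block leaves the endpoint in queuedDeliveries, whereas the same check inside try/finally lets the finally block clean up.

```java
import java.net.InetAddress;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for HintedHandOffManager; NOT the Cassandra class.
class HintDeliverySketch {
    // Plays the role of queuedDeliveries in the quoted code.
    private final Set<InetAddress> queuedDeliveries = ConcurrentHashMap.newKeySet();
    // Assumption: a plain flag replaces hintStore.isEmpty().
    private boolean hintStoreEmpty;
    private int deliveries = 0;

    void setHintStoreEmpty(boolean empty) { hintStoreEmpty = empty; }
    int deliveries() { return deliveries; }
    boolean isQueued(InetAddress endpoint) { return queuedDeliveries.contains(endpoint); }

    // Returns false when the endpoint is already queued, mirroring the
    // "if (!queuedDeliveries.add(to)) return;" guard in the quoted code.
    boolean deliverHints(InetAddress to, boolean guardInsideTry) {
        if (!queuedDeliveries.add(to)) {
            return false;
        }
        if (guardInsideTry) {
            deliverWithGuardInsideTry(to);
        } else {
            deliverAsQuoted(to);
        }
        return true;
    }

    // Same shape as the quoted method: the early return happens before the
    // try block, so the finally block never runs and the endpoint stays queued.
    private void deliverAsQuoted(InetAddress endpoint) {
        if (hintStoreEmpty) {
            return;
        }
        try {
            deliveries++; // placeholder for the elided delivery work
        } finally {
            queuedDeliveries.remove(endpoint);
        }
    }

    // Proposed shape: the check sits inside try, so finally always removes
    // the endpoint, even when there is nothing to deliver.
    private void deliverWithGuardInsideTry(InetAddress endpoint) {
        try {
            if (hintStoreEmpty) {
                return;
            }
            deliveries++; // placeholder for the elided delivery work
        } finally {
            queuedDeliveries.remove(endpoint);
        }
    }
}
```

With the quoted shape, one call made while the store is empty is enough to make every later deliverHints call for that endpoint return immediately; with the check inside the try block, later calls proceed normally.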



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
