cassandra-commits mailing list archives

From "Jaydeepkumar Chovatia (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-13740) Orphan hint file gets created while node is being removed from cluster
Date Mon, 11 Sep 2017 21:37:01 GMT


Jaydeepkumar Chovatia commented on CASSANDRA-13740:

Hi [~iamaleksey]

Can you please review my latest patch?


> Orphan hint file gets created while node is being removed from cluster
> ----------------------------------------------------------------------
>                 Key: CASSANDRA-13740
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Jaydeepkumar Chovatia
>            Assignee: Jaydeepkumar Chovatia
>            Priority: Minor
>             Fix For: 3.0.x, 3.11.x
>         Attachments: 13740-3.0.15.txt,
> I found this issue during testing: whenever a node is being removed, a hint file for
> that node gets written and stays inside the hint directory forever. I debugged the code
> and found that this is due to a race condition between [
> and [ |].
> *Time t1* The node is down; as a result, hints are being written by [
> *Time t2* The node is removed from the cluster; as a result it calls [
> which removes the hint files for the node being removed.
> *Time t3* The mutation stage keeps pumping hints through [ |]
> which again calls [ |]
> and a new orphan hint file gets created.
> I was writing a new dtest for {CASSANDRA-13562, CASSANDRA-13308}, and that helped me
> reproduce this new bug. I will submit a patch for this new dtest later.
> I also tried the following to check how this orphan hint file responds:
> 1. I tried {{nodetool truncatehints <node>}}, but it fails because the node is no longer
> part of the ring.
> 2. I then tried {{nodetool truncatehints}}, which still doesn’t remove the hint file because
> it is not yet included in the [dispatchDequeue |].
> Steps to reproduce:
> Please find the attached dtest Python file {{}}, which reproduces this.
> Solution:
> This is due to the race condition mentioned above. Since {{}} creates a
> thread pool with only one worker, the solution is fairly simple. Whenever we [
> a host, just store it in memory, and check for an already-evicted host inside [
> If an already-evicted host is found, ignore its hints.
> Jaydeep
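
The solution described in the quoted issue can be sketched roughly as follows. This is
only an illustration under stated assumptions, not the attached patch: the class
EvictionAwareHintStore and its methods write, evict, and hasHintFile are hypothetical
names, hint files are modeled as in-memory lists rather than files on disk, and the
single-worker executor stands in for the one-worker thread pool mentioned in the issue.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stand-in for the hints write path; these are not Cassandra's classes. */
class EvictionAwareHintStore {
    // One worker thread serializes all writes and evictions, so the plain
    // collections below are only ever touched from that single thread.
    private final ExecutorService writeExecutor =
        Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "hints-write-sketch");
            t.setDaemon(true); // do not keep the JVM alive for this sketch
            return t;
        });

    // Assumption: one "hint file" per host, modeled as a list of hint payloads.
    private final Map<UUID, List<String>> hintFiles = new HashMap<>();

    // Hosts that have been evicted; remembered in memory only.
    private final Set<UUID> evictedHosts = new HashSet<>();

    /** Writes a hint unless the host was already evicted. Returns true if written. */
    boolean write(UUID hostId, String hint) {
        return CompletableFuture.supplyAsync(() -> {
            if (evictedHosts.contains(hostId)) {
                return false; // late hint for a removed node: ignore it
            }
            hintFiles.computeIfAbsent(hostId, id -> new ArrayList<>()).add(hint);
            return true;
        }, writeExecutor).join();
    }

    /** Records the host as evicted and drops its hint file, on the same worker. */
    void evict(UUID hostId) {
        CompletableFuture.runAsync(() -> {
            evictedHosts.add(hostId);
            hintFiles.remove(hostId);
        }, writeExecutor).join();
    }

    /** Reports whether a hint file currently exists for the host. */
    boolean hasHintFile(UUID hostId) {
        return CompletableFuture
            .supplyAsync(() -> hintFiles.containsKey(hostId), writeExecutor)
            .join();
    }
}
```

Because eviction and writes run on the same single worker, a write queued after the
eviction always sees the host in the evicted set, so the t3 write from the timeline
above is dropped instead of recreating the file. The sketch never forgets an evicted
host; whether and when to clear that set is left open here.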
