cassandra-user mailing list archives

From Léo FERLIN SUTTON <lfer...@mailjet.com.INVALID>
Subject Re: Bootstrap keeps failing
Date Fri, 08 Feb 2019 14:16:18 GMT
On Thu, Feb 7, 2019 at 10:11 PM Kenneth Brotman
<kenbrotman@yahoo.com.invalid> wrote:

> Lots of things come to mind. We need more information from you to help us
> understand:
>
> How long have you had your cluster running?
>
A bit more than a year old. But it has been constantly growing (3 nodes to
6 nodes to 12 nodes, etc).
We have a replication_factor of 3 on all keyspaces and 3 racks with an
equal number of nodes.

> Is it generally working ok?
>
Works fine. Good performance, repairs managed by cassandra-reaper.

> Is it just one node that is misbehaving at a time?
>
We only bootstrap nodes one at a time. Sometimes it works flawlessly,
sometimes it fails. When it fails it tends to fail a lot in a row before we
manage to get it bootstrapped.

> How many nodes do you need to replace?
>
I am adding nodes, not replacing any. Our nodes are starting to get very
full and we wish to add at least 6 more nodes (short-term).
Adding a new node is quite slow (48 to 72 hours), and that's when the
bootstrap process works on the first try.

> Are you doing rolling restarts instead of simultaneously?
>
Yes.

> Do you have enough capacity on your machines? Did you say some of the
> nodes are at 90% capacity?
>
Disk usage fluctuates but is generally between 80% and 90%,
which is why we are planning to add a lot more nodes.

> When did this problem begin?
>
Not sure about this one. Probably since our nodes started holding more than
2 TB of data; I don't remember it being an issue when our nodes were smaller.

> Could something be causing a race condition?
>
We have schema changes every day.
We have temporary data stored in cassandra, only used for 6 days then
destroyed.

In order to avoid tombstones we use a table rotation: every day we create
a new table to contain the data for the next day, and we drop the oldest
temporary table.
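
To make the rotation concrete, here is a minimal sketch of how the daily plan could be computed. The naming scheme (a prefix plus the number of days since 1970-01-01), the keyspace name `raw`, and the choice of which table counts as "oldest" are assumptions made for illustration, not a description of our actual tooling.

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)
RETENTION_DAYS = 6  # temporary data is used for 6 days, then destroyed


def table_name(day, prefix="raw"):
    """Hypothetical naming scheme: <prefix>_<days since 1970-01-01>."""
    return f"{prefix}_{(day - EPOCH).days}"


def rotation_plan(today, keyspace="raw"):
    """Return the table to create for tomorrow and the DROP statement
    for the table that has aged out (assumed: today minus 6 days)."""
    tomorrow = today + timedelta(days=1)
    expired = today - timedelta(days=RETENTION_DAYS)
    return {
        "create": f"{keyspace}.{table_name(tomorrow)}",
        "drop": f"DROP TABLE IF EXISTS {keyspace}.{table_name(expired)};",
    }


if __name__ == "__main__":
    print(rotation_plan(date.today()))
```

With a bootstrap lasting 48 to 72 hours, at least two such rotations happen while the new node is still streaming.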

This means that when the node starts to bootstrap it will ask other nodes
for data that will almost certainly be dropped before the bootstrap process
is finished.

> Did you recheck the commands you used to make sure they are correct?
>
> What procedure do you use?
>

Our procedure is:

   1. We install cassandra on a brand new instance (debian).
   2. We stop the default cassandra (launched by the debian package).
   3. We empty these directories:
   /var/lib/cassandra/commitlog
   /var/lib/cassandra/data
   /var/lib/cassandra/saved_caches
   4. We put our configuration in place of the default one.
   5. We start cassandra.

If after 3 days we see that the node hasn't joined the cluster we check the
`nodetool netstats` command to see if the node is still streaming data. If
it is not we launch `nodetool bootstrap resume` on the instance.
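
For illustration, the "is it still streaming?" check could be scripted roughly as below. This is only a sketch: it assumes the per-session summary lines look like the `nodetool netstats` output quoted further down in this thread, and the function names, the unit table, and the "no new bytes between two samples" stall heuristic are all hypothetical choices.

```python
import re

# Size units as they may appear in human-readable netstats output (assumed).
UNITS = {
    "bytes": 1,
    "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4,
    "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4,
}

# e.g. "Receiving 480 files, 325.39 GB total. Already received 5 files, 68.32 MB total"
SUMMARY = re.compile(
    r"Receiving (\d+) files, ([\d.]+) (\w+) total\. "
    r"Already received (\d+) files, ([\d.]+) (\w+) total"
)


def parse_sessions(netstats_output):
    """Extract one dict per incoming stream session from netstats text."""
    # Collapse whitespace so summary lines wrapped by a terminal still match.
    text = " ".join(netstats_output.split())
    sessions = []
    for files, size, unit, got_files, got_size, got_unit in SUMMARY.findall(text):
        sessions.append({
            "files": int(files),
            "bytes": float(size) * UNITS[unit],
            "received_files": int(got_files),
            "received_bytes": float(got_size) * UNITS[got_unit],
        })
    return sessions


def received_bytes(netstats_output):
    """Total bytes already received across all sessions."""
    return sum(s["received_bytes"] for s in parse_sessions(netstats_output))


def looks_stalled(earlier_output, later_output):
    """Heuristic: stalled if no additional bytes arrived between two samples."""
    return received_bytes(later_output) <= received_bytes(earlier_output)
```

The idea would be to capture the output of `nodetool netstats` twice, some minutes apart, and only consider `nodetool bootstrap resume` when `looks_stalled` returns true for the pair.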

Thank you for your interest in our issue!

Regards,

Leo


>
>
> *From:* Léo FERLIN SUTTON [mailto:lferlin@mailjet.com.INVALID]
> *Sent:* Thursday, February 07, 2019 9:16 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: [EXTERNAL] Re: Bootstrap keeps failing
>
>
>
> Thank you for the recommendation.
>
>
>
> We are already using datastax's recommended settings for tcp_keepalive
>
>
>
> Regards,
>
>
>
> Leo
>
>
>
> On Thu, Feb 7, 2019 at 5:49 PM Durity, Sean R <SEAN_R_DURITY@homedepot.com>
> wrote:
>
> I have seen unreliable streaming (streaming that doesn’t finish) because
> of TCP timeouts from firewalls or switches. The default tcp_keepalive
> kernel parameters are usually not tuned for that. See
> https://docs.datastax.com/en/dse-trblshoot/doc/troubleshooting/idleFirewallLinux.html
> for more details. These “remote” timeouts are difficult to detect or prove
> if you don’t have access to the intermediate network equipment.
>
>
>
> Sean Durity
>
> *From:* Léo FERLIN SUTTON <lferlin@mailjet.com.INVALID>
> *Sent:* Thursday, February 07, 2019 10:26 AM
> *To:* user@cassandra.apache.org; dinesh.joshi@yahoo.com
> *Subject:* [EXTERNAL] Re: Bootstrap keeps failing
>
>
>
> Hello !
>
> Thank you for your answers.
>
>
>
> So I have tried, multiple times, to start bootstrapping from scratch. I
> often have the same problem (on other nodes as well) but sometimes it works
> and I can move on to another node.
>
>
>
> I have attached a jstack dump and some logs.
>
>
>
> Our node was shut down at around 97% disk space used.
>
> I turned it back on and it started the bootstrap process again.
>
>
>
> The log file is the log from this attempt, same for the thread dump.
>
>
>
> Small warning, I have somewhat anonymised the log files so there may be
> some inconsistencies.
>
>
>
> Regards,
>
>
>
> Leo
>
>
>
> On Thu, Feb 7, 2019 at 8:13 AM dinesh.joshi@yahoo.com.INVALID <
> dinesh.joshi@yahoo.com.invalid <dinesh.joshi@yahoocom.invalid>> wrote:
>
> Would it be possible for you to take a thread dump & logs and share them?
>
>
>
> Dinesh
>
>
>
>
>
> On Wednesday, February 6, 2019, 10:09:11 AM PST, Léo FERLIN SUTTON <
> lferlin@mailjet.com.INVALID> wrote:
>
>
>
>
>
> Hello !
>
>
>
> I am having a recurrent problem when trying to bootstrap a few new nodes.
>
>
>
> Some general info :
>
>    - I am running cassandra 3.0.17
>    - We have about 30 nodes in our cluster
>    - All healthy nodes have between 60% and 90% used disk space on
>    /var/lib/cassandra
>
> So I create a new node and let auto_bootstrap do its job. After a few
> days the bootstrapping node stops streaming new data but is still not a
> member of the cluster.
>
>
>
> `nodetool status` says the node is still joining.
>
>
>
> When this happens I run `nodetool bootstrap resume`. This usually ends
> in one of two ways:
>
>    1. The node fills up to 100% disk space and crashes.
>    2. The bootstrap resume finishes with errors
>
> When I look at `nodetool netstats -H` it looks like `bootstrap resume`
> does not resume but restarts a full transfer of all data from every node.
>
>
>
> This is the output I get from `nodetool bootstrap resume`:
>
> [2019-02-06 01:39:14,369] received file
> /var/lib/cassandra/raw/raw_17930-d7cc0590230d11e9bc0af381b0ee7ac6/mc-225-big-Data.db
> (progress: 2113%)
>
> [2019-02-06 01:39:16,821] received file
> /var/lib/cassandra/data/system_distributed/repair_history-759fffad624b318180eefa9a52d1f627/mc-88-big-Data.db
> (progress: 2113%)
>
> [2019-02-06 01:39:17,003] received file
> /var/lib/cassandra/data/system_distributed/repair_history-759fffad624b318180eefa9a52d1f627/mc-89-big-Data.db
> (progress: 2113%)
>
> [2019-02-06 01:39:17,032] session with /10.16.XX.YYY complete (progress:
> 2113%)
>
> [2019-02-06 01:41:15,160] received file
> /var/lib/cassandra/raw/raw_17930-d7cc0590230d11e9bc0af381b0ee7ac6/mc-220-big-Data.db
> (progress: 2113%)
>
> [2019-02-06 01:42:02,864] received file
> /var/lib/cassandra/raw/raw_17930-d7cc0590230d11e9bc0af381b0ee7ac6/mc-226-big-Data.db
> (progress: 2113%)
>
> [2019-02-06 01:42:09,284] received file
> /var/lib/cassandra/raw/raw_17930-d7cc0590230d11e9bc0af381b0ee7ac6/mc-227-big-Data.db
> (progress: 2113%)
>
> [2019-02-06 01:42:10,522] received file
> /var/lib/cassandra/raw/raw_17930-d7cc0590230d11e9bc0af381b0ee7ac6/mc-228-big-Data.db
> (progress: 2113%)
>
> [2019-02-06 01:42:10,622] received file
> /var/lib/cassandra/raw/raw_17930-d7cc0590230d11e9bc0af381b0ee7ac6/mc-229-big-Data.db
> (progress: 2113%)
>
> [2019-02-06 01:42:11,925] received file
> /var/lib/cassandra/data/system_distributed/repair_history-759fffad624b318180eefa9a52d1f627/mc-90-big-Data.db
> (progress: 2114%)
>
> [2019-02-06 01:42:14,887] received file
> /var/lib/cassandra/data/system_distributed/repair_history-759fffad624b318180eefa9a52d1f627/mc-91-big-Data.db
> (progress: 2114%)
>
> [2019-02-06 01:42:14,980] session with /10.16.XX.ZZZ complete (progress:
> 2114%)
>
> [2019-02-06 01:42:14,980] Stream failed
>
> [2019-02-06 01:42:14,982] Error during bootstrap: Stream failed
>
> [2019-02-06 01:42:14,982] Resume bootstrap complete
>
>
>
> The bootstrap `progress` goes way over 100% and eventually fails.
>
>
>
>
>
> Right now I have a node with this output from `nodetool status`:
>
> `UJ  10.16.XX.YYY  2.93 TB    256          ?  5788f061-a3c0-46af-b712-ebeecd397bf7  c`
>
>
>
> It is almost filled with data, yet if I look at `nodetool netstats`:
>
>         Receiving 480 files, 325.39 GB total. Already received 5 files, 68.32 MB total
>         Receiving 499 files, 328.96 GB total. Already received 1 files, 1.32 GB total
>         Receiving 506 files, 345.33 GB total. Already received 6 files, 24.19 MB total
>         Receiving 362 files, 206.73 GB total. Already received 7 files, 34 MB total
>         Receiving 424 files, 281.25 GB total. Already received 1 files, 1.3 GB total
>         Receiving 581 files, 349.26 GB total. Already received 8 files, 45.96 MB total
>         Receiving 443 files, 337.26 GB total. Already received 6 files, 96.15 MB total
>         Receiving 424 files, 275.23 GB total. Already received 5 files, 42.67 MB total
>
>
>
> It is trying to pull all the data again.
>
>
>
> Am I missing something about the way `nodetool bootstrap resume` is
> supposed to be used?
>
>
>
> Regards,
>
>
>
> Leo
>
>
>
>
