Date: Sun, 28 Feb 2016 18:32:18 +0000 (UTC)
From: "Jason Kania (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Created] (CASSANDRA-11273) Exceptions during bootstrap cause bootstrap to hang (WORKAROUND)

Jason Kania created CASSANDRA-11273:
---------------------------------------

             Summary: Exceptions during bootstrap cause bootstrap to hang (WORKAROUND)
                 Key: CASSANDRA-11273
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11273
             Project: Cassandra
          Issue Type: Bug
          Components: Lifecycle
         Environment: Debian jessie, patches current, running Cassandra 3.0.3
            Reporter: Jason Kania

When bootstrapping a new node, the following problem can occur: while receiving streamed files, Cassandra fails with an "Unknown column" error during deserialization, apparently because it does not recognize some columns for some reason.
The error prevents the bootstrap from finishing and leaves it hung. If the bootstrap is resumed, it hits the same error, so the bootstrap can never complete.

from 192.168.10.8

ERROR [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,857 StreamSession.java:635 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Remote peer 192.168.10.10 failed stream session.
INFO [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,857 StreamResultFuture.java:182 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Session with /192.168.10.10 is complete
WARN [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,858 StreamResultFuture.java:209 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Stream failed

from 192.168.10.8 debug

DEBUG [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,414 ConnectionHandler.java:262 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Received Received (79256340-bbbb-11e5-9f70-7d76a8de8480, #0)
DEBUG [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,854 ConnectionHandler.java:262 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Received Retry (f3a137e0-024b-11e5-bb31-0d2316086bf7, #0)
DEBUG [STREAM-OUT-/192.168.10.10] 2016-02-27 20:37:53,854 ConnectionHandler.java:334 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Sending File (Header (cfId: f3a137e0-024b-11e5-bb31-0d2316086bf7, #0, version: ma, format: BIG, estimated keys: 128, transfer size: 4653, compressed?: true, repairedAt: 0, level: 0), file: /home/cassandra/data/sensordb/sensor/ma-76-big-Data.db)
DEBUG [STREAM-OUT-/192.168.10.10] 2016-02-27 20:37:53,854 CompressedStreamWriter.java:63 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Start streaming file /home/cassandra/data/sensordb/sensor/ma-76-big-Data.db to /192.168.10.10, repairedAt = 0, totalSize = 4653
DEBUG [STREAM-OUT-/192.168.10.10] 2016-02-27 20:37:53,854 CompressedStreamWriter.java:94 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Finished streaming file /home/cassandra/data/sensordb/sensor/ma-76-big-Data.db to /192.168.10.10, bytesTransferred = 4653, totalSize = 4653
DEBUG [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,855 ConnectionHandler.java:262 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Received Retry (faa55490-024b-11e5-bb31-0d2316086bf7, #0)
DEBUG [STREAM-OUT-/192.168.10.10] 2016-02-27 20:37:53,855 ConnectionHandler.java:334 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Sending File (Header (cfId: faa55490-024b-11e5-bb31-0d2316086bf7, #0, version: ma, format: BIG, estimated keys: 128, transfer size: 705, compressed?: true, repairedAt: 0, level: 0), file: /home/cassandra/data/sensordb/sensorUnit/ma-79-big-Data.db)
DEBUG [STREAM-OUT-/192.168.10.10] 2016-02-27 20:37:53,856 CompressedStreamWriter.java:63 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Start streaming file /home/cassandra/data/sensordb/sensorUnit/ma-79-big-Data.db to /192.168.10.10, repairedAt = 0, totalSize = 705
DEBUG [STREAM-OUT-/192.168.10.10] 2016-02-27 20:37:53,856 CompressedStreamWriter.java:94 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Finished streaming file /home/cassandra/data/sensordb/sensorUnit/ma-79-big-Data.db to /192.168.10.10, bytesTransferred = 705, totalSize = 705
DEBUG [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,857 ConnectionHandler.java:262 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Received Session Failed
ERROR [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,857 StreamSession.java:635 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Remote peer 192.168.10.10 failed stream session.
DEBUG [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,857 ConnectionHandler.java:110 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Closing stream connection handler on /192.168.10.10
INFO [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,857 StreamResultFuture.java:182 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Session with /192.168.10.10 is complete
WARN [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,858 StreamResultFuture.java:209 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Stream failed

from 192.168.10.10

[2016-02-27 20:37:53,413] received file /home/cassandra/data/sensordb/listedAttributes-79256340bbbb11e59f707d76a8de8480/ma-32-big-Data.db (progress: 365%)
[2016-02-27 20:37:53,414] received file /home/cassandra/data/sensordb/liestedAttributes-79256340bbbb11e59f707d76a8de8480/ma-32-big-Data.db (progress: 369%)
[2016-02-27 20:37:53,865] session with /192.168.10.8 complete (progress: 369%)
[2016-02-27 20:37:53,866] Stream failed

from 192.168.10.10 debug

DEBUG [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,201 CompressedStreamReader.java:80 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Start receiving file #0 from /192.168.10.8, repairedAt = 0, size = 166627, ks = 'sensordb', table = 'listAttributes'.
DEBUG [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,412 CompressedStreamReader.java:110 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Finished receiving file #0 from /192.168.10.8 readBytes = 166627, totalSize = 166627
DEBUG [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,412 ConnectionHandler.java:262 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Received File (Header (cfId: 79256340-bbbb-11e5-9f70-7d76a8de8480, #0, version: ma, format: BIG, estimated keys: 128, transfer size: 166627, compressed?: true, repairedAt: 0, level: 0), file: /home/cassandra/data/sensordb/listAttributes-79256340bbbb11e59f707d76a8de8480/ma-32-big-Data.db)
DEBUG [STREAM-OUT-/192.168.10.8] 2016-02-27 20:37:53,412 ConnectionHandler.java:334 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Sending Received (79256340-bbbb-11e5-9f70-7d76a8de8480, #0)
DEBUG [CompactionExecutor:3] 2016-02-27 20:37:53,833 CompactionTask.java:217 - Compacted (e224bef0-ddbb-11e5-80c0-89f591237aca) 4 sstables to [/home/cassandra/data/system_distributed/parent_repair_history-deabd734b99d3b9c92e5fd92eb5abf14/ma-5-big,] to level=0. 2,743,164 bytes to 685,791 (~25% of original) in 1,096ms = 0.596735MB/s. 0 total partitions merged to 57. Partition merge counts were {4:57, }
DEBUG [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,850 CompressedStreamReader.java:80 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Start receiving file #0 from /192.168.10.8, repairedAt = 0, size = 4653, ks = 'sensordb', table = 'sensor'.
WARN [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,851 StreamSession.java:641 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Retrying for following error
java.lang.RuntimeException: Unknown column lastEvaluation during deserialization
    at org.apache.cassandra.db.SerializationHeader$Component.toHeader(SerializationHeader.java:331) ~[apache-cassandra-3.0.3.jar:3.0.3]
    at org.apache.cassandra.streaming.compress.CompressedStreamReader.read(CompressedStreamReader.java:87) ~[apache-cassandra-3.0.3.jar:3.0.3]
    at org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:50) [apache-cassandra-3.0.3.jar:3.0.3]
    at org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:39) [apache-cassandra-3.0.3.jar:3.0.3]
    at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:59) [apache-cassandra-3.0.3.jar:3.0.3]
    at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:261) [apache-cassandra-3.0.3.jar:3.0.3]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_74]
DEBUG [STREAM-OUT-/192.168.10.8] 2016-02-27 20:37:53,852 ConnectionHandler.java:334 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Sending Retry (f3a137e0-024b-11e5-bb31-0d2316086bf7, #0)
DEBUG [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,852 ConnectionHandler.java:262 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Received null
DEBUG [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,853 CompressedStreamReader.java:80 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Start receiving file #0 from /192.168.10.8, repairedAt = 0, size = 705, ks = 'sensordb', table = 'sensorUnit'.
WARN [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,854 StreamSession.java:641 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Retrying for following error
java.lang.RuntimeException: Unknown column lastCheckTime during deserialization
    at org.apache.cassandra.db.SerializationHeader$Component.toHeader(SerializationHeader.java:331) ~[apache-cassandra-3.0.3.jar:3.0.3]
    at org.apache.cassandra.streaming.compress.CompressedStreamReader.read(CompressedStreamReader.java:87) ~[apache-cassandra-3.0.3.jar:3.0.3]
    at org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:50) [apache-cassandra-3.0.3.jar:3.0.3]
    at org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:39) [apache-cassandra-3.0.3.jar:3.0.3]
    at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:59) [apache-cassandra-3.0.3.jar:3.0.3]
    at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:261) [apache-cassandra-3.0.3.jar:3.0.3]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_74]
DEBUG [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,854 ConnectionHandler.java:262 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Received null

Possible workaround

To work around this, do the following:

1) In cqlsh on the new node, run:
       use system;
       select host_id from local;
2) Save that host_id UUID for later use.
3) Change cassandra.yaml to set auto_bootstrap to false.
4) Stop the database on the new node.
5) Remove all the contents of the data directory on the new node.
6) Copy all files from the data directory on an existing replica node to the data directory on the new node.
7) Start Cassandra on the new node in network isolation, or restart Cassandra on the other nodes in the cluster after starting the new node.
8) In cqlsh on the new node, run the following, supplying the host_id saved in step 2:
       use system;
       update local set host_id=,tokens=null where key='local';
       update local set broadcast_address='',listen_address='',rpc_address='' where key='local';
9) On the new node, run the following to save the updated system data:
       nodetool flush system local
10) Restart Cassandra on the new node.
11) Run the following on the new node to generate the data tokens:
       nodetool repair

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
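
Before trying the workaround, it may help to see which tables and columns are involved. The sketch below is only an illustration, not part of Cassandra: it scans debug log lines from the new node and pairs each "Unknown column ... during deserialization" error with the most recent "Start receiving file" line. The function name `unknown_columns_by_table` is hypothetical, the patterns are taken from the 3.0.3 log lines quoted above, and it assumes the lines of one stream are not interleaved with another's.

```python
import re
from collections import defaultdict

# Patterns copied from the log wording in this report (Cassandra 3.0.3).
# Other versions may phrase these messages differently (assumption).
START_RE = re.compile(
    r"Start receiving file #\d+ from \S+, .*"
    r"ks = '(?P<ks>[^']+)', table = '(?P<table>[^']+)'"
)
UNKNOWN_RE = re.compile(r"Unknown column (?P<column>\S+) during deserialization")


def unknown_columns_by_table(lines):
    """Map 'keyspace.table' to the sorted unknown column names seen after it."""
    found = defaultdict(set)
    current = None  # table named by the most recent "Start receiving file" line
    for line in lines:
        start = START_RE.search(line)
        if start:
            current = f"{start.group('ks')}.{start.group('table')}"
        unknown = UNKNOWN_RE.search(line)
        if unknown:
            found[current or "<unknown table>"].add(unknown.group("column"))
    return {table: sorted(cols) for table, cols in found.items()}


# Message portions of the log lines from the new node, prefixes omitted.
sample = [
    "Start receiving file #0 from /192.168.10.8, repairedAt = 0, size = 4653, ks = 'sensordb', table = 'sensor'.",
    "java.lang.RuntimeException: Unknown column lastEvaluation during deserialization",
    "Start receiving file #0 from /192.168.10.8, repairedAt = 0, size = 705, ks = 'sensordb', table = 'sensorUnit'.",
    "java.lang.RuntimeException: Unknown column lastCheckTime during deserialization",
]
print(unknown_columns_by_table(sample))
# {'sensordb.sensor': ['lastEvaluation'], 'sensordb.sensorUnit': ['lastCheckTime']}
```

On a real node the same function could be fed the lines of the debug log file instead of the sample list; the result only suggests where to compare schemas between the nodes and does not explain why the columns are unknown.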