From: Keith Wright
To: "user@cassandra.apache.org", Don Jackson
Date: Mon, 5 Aug 2013 20:08:35 -0500
Subject: Re: Unable to bootstrap node

Yes we likely dropped and recreated tables. If we stop the sending node, what will happen to the bootstrapping node?

sankalp kohli <kohlisankalp@gmail.com> wrote:

Hi,
    The problem is that the node sending the stream is hitting this FileNotFound exception. You need to restart this node and it should fix the problem.

Are you seeing a lot of FileNotFoundExceptions? Did you make any schema changes recently?

Sankalp


On Mon, Aug 5, 2013 at 5:39 PM, Keith Wright <kwright@nanigans.com> wrote:
Hi all,

   I have been trying to bootstrap a new node into my 7-node 1.2.4 C* cluster (vnodes, RF 3) with no luck. It gets close to completing, and then the streaming just stalls at 99% from 1 or 2 nodes. Nodetool netstats shows the items that have yet to stream, but the logs on the new node do not show any errors. I tried shutting down the node, clearing all data/commit logs/caches, and re-bootstrapping, with no luck. The nodes that are hanging while sending the data only have the error below, but that's related to compactions (see below), although it is one of the files that is waiting to be sent. I tried nodetool scrub on the column family with the missing item but got an error indicating it could not get a hard link. Any ideas? We were able to bootstrap one of the new nodes with no issues, but this other one has been a real pain. Note that when the new node is joining the cluster, it does not appear in nodetool status. Is that expected?

Thanks all. My next step is to try getting a new IP for this machine, my thought being that the cluster doesn't like me repeatedly attempting to bootstrap the node, each time with a new host ID.

[kwright@lxpcas008 ~]$ nodetool netstats | grep rts-40301_feedProducts-ib-1-Data.db
   rts: /data/1/cassandra/data/rts/40301_feedProducts/rts-40301_feedProducts-ib-1-Data.db sections=73 progress=0/1884669 - 0%

ERROR [ReadStage:427] 2013-08-05 23:23:29,294 CassandraDaemon.java (line 174) Exception in thread Thread[ReadStage:427,5,main]
java.lang.RuntimeException: java.io.FileNotFoundException: /data/1/cassandra/data/rts/40301_feedProducts/rts-40301_feedProducts-ib-1-Data.db (No such file or directory)
        at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:46)
        at org.apache.cassandra.io.util.CompressedSegmentedFile.createReader(CompressedSegmentedFile.java:57)
        at org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:41)
        at org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:976)
        at org.apache.cassandra.db.columniterator.SSTableNamesIterator.createFileDataInput(SSTableNamesIterator.java:98)
        at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:117)
        at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:64)
        at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:81)
        at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:68)
        at org.apache.cassandra.db.CollationController.collectTimeOrderedData(CollationController.java:133)
        at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
        at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
        at org.apache.cassandra.db.Table.getRow(Table.java:347)
        at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:64)
        at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:722)
Caused by: java.io.FileNotFoundException: /data/1/cassandra/data/rts/40301_feedProducts/rts-40301_feedProducts-ib-1-Data.db (No such file or directory)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
        at org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:67)
        at org.apache.cassandra.io.compress.CompressedRandomAccessReader.<init>(CompressedRandomAccessReader.java:75)
        at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:42)
        ... 20 more
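
The netstats line and the FileNotFoundException above point at the same SSTable, so one way to look for other affected files on a sending node is to compare its pending streams against what is actually on disk. The following is only a rough sketch, not part of the original report: the line pattern is inferred from the single netstats sample quoted above (other Cassandra versions may print it differently), and the function names `parse_stream_line` and `missing_pending_files` are hypothetical.

```python
import os
import re

# Pattern inferred from the one sample line in this thread:
#   rts: /path/to/file-Data.db sections=73 progress=0/1884669 - 0%
# This is an assumption about the format, not a documented contract.
LINE_RE = re.compile(
    r"^\s*(?P<keyspace>\S+):\s+(?P<path>\S+)\s+sections=(?P<sections>\d+)\s+"
    r"progress=(?P<sent>\d+)/(?P<total>\d+)\s+-\s+(?P<pct>\d+)%\s*$"
)


def parse_stream_line(line):
    """Return a dict for one streaming line, or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return {
        "keyspace": m.group("keyspace"),
        "path": m.group("path"),
        "sent": int(m.group("sent")),
        "total": int(m.group("total")),
    }


def missing_pending_files(lines, exists=os.path.exists):
    """List paths that still have bytes left to send but are absent from disk.

    `exists` is injectable so the check can be exercised without real files.
    """
    missing = []
    for line in lines:
        entry = parse_stream_line(line)
        if entry and entry["sent"] < entry["total"] and not exists(entry["path"]):
            missing.append(entry["path"])
    return missing
```

Passing the lines of `nodetool netstats` output captured on the sending node to `missing_pending_files` would list any queued files that no longer exist there, which is the condition the stack trace suggests for `rts-40301_feedProducts-ib-1-Data.db`.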
