Date: Tue, 3 May 2011 20:29:28 -0400
Subject: Decommissioning node is causing broken pipe error
Reply-To: user@cassandra.apache.org
Message-ID: <6536AD0871E35F4888BB090DD6ECF4806CD393634E@AMRXM3124.dir.svc.accenture.com>

Hi all,

 

I ran decommission on a node in my 32-node cluster. After about an hour of streaming files to another node, I got this error on the node being decommissioned:

INFO [MiscStage:1] 2011-05-03 21:49:00,235 StreamReplyVerbHandler.java (line 58) Need to re-stream file /raiddrive/MDR/MeterRecords-f-2283-Data.db to /10.206.63.208

ERROR [Streaming:1] 2011-05-03 21:49:01,580 DebuggableThreadPoolExecutor.java (line 103) Error in ThreadPoolExecutor
java.lang.RuntimeException: java.io.IOException: Broken pipe
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Broken pipe
        at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
        at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
        at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
        at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:105)
        at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:67)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
        ... 3 more

ERROR [Streaming:1] 2011-05-03 21:49:01,581 AbstractCassandraDaemon.java (line 112) Fatal exception in thread Thread[Streaming:1,1,main]
java.lang.RuntimeException: java.io.IOException: Broken pipe
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Broken pipe
        at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
        at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
        at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
        at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:105)
        at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:67)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
        ... 3 more

 

And this message on the node that it was streaming to:

INFO [Thread-333] 2011-05-03 21:49:00,234 StreamInSession.java (line 121) Streaming of file /raiddrive/MDR/MeterRecords-f-2283-Data.db/(98605680685,197932763967)
         progress=49016107008/99327083282 - 49% from org.apache.cassandra.streaming.StreamInSession@33721219 failed: requesting a retry.
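
For reference, the numbers in that receiver-side line can be checked with a short script. This is only a sketch: the regular expression is a hypothetical pattern written for this one log line, not a general Cassandra log parser, and reading the two numbers in parentheses as a start/end byte range is an assumption (it is consistent with the log, since their difference equals the reported total).

```python
import re

# Receiver-side log text copied from the message above.
LOG = (
    "Streaming of file /raiddrive/MDR/MeterRecords-f-2283-Data.db/"
    "(98605680685,197932763967)\n"
    "\t progress=49016107008/99327083282 - 49%"
)

# Hypothetical pattern for this one line; not a general log parser.
PATTERN = re.compile(
    r"\((?P<start>\d+),(?P<end>\d+)\)\s+progress=(?P<done>\d+)/(?P<total>\d+)"
)


def summarize(log_text):
    """Extract the section range and progress counters from the log text."""
    match = PATTERN.search(log_text)
    if match is None:
        raise ValueError("no streaming progress found")
    start, end, done, total = (
        int(match.group(name)) for name in ("start", "end", "done", "total")
    )
    return {
        # Assumption: the pair in parentheses is a (start, end) byte range.
        "section_bytes": end - start,
        "section_matches_total": end - start == total,
        # Integer percentage, as printed in the log.
        "percent": done * 100 // total,
        "remaining_bytes": total - done,
    }


if __name__ == "__main__":
    for key, value in summarize(LOG).items():
        print(key, value)
```

Under that reading, the transfer failed a little under halfway through a section of roughly 99 GB, with about 50 GB still to send.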

 

I tried running decommission again (and running scrub + decommission), but I keep getting this error on the same file.

 

I checked out the file and saw that it is a lot bigger than all the other sstables… 184GB instead of about 74MB. I haven’t run a major compaction for a bit, so I’m trying to stream 658 sstables.
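
With 658 sstables, one way to confirm that only a single file is oversized is to compare every data file against the median size. The sketch below is illustrative only: the function name and the `factor` threshold are hypothetical, and it assumes the data files in the directory end in `-Data.db`, as the file in the log does. It reads file sizes and changes nothing on disk.

```python
import statistics
from pathlib import Path


def find_outlier_sstables(data_dir, factor=100):
    """Return (path, size) pairs for *-Data.db files larger than factor x the median.

    `factor` is a hypothetical threshold chosen for illustration.
    """
    directory = Path(data_dir)
    if not directory.is_dir():
        return []
    sizes = {p: p.stat().st_size for p in directory.glob("*-Data.db")}
    if not sizes:
        return []
    median = statistics.median(sizes.values())
    outliers = [(p, s) for p, s in sizes.items() if s > factor * median]
    # Largest files first.
    return sorted(outliers, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # /raiddrive/MDR is the directory that appears in the log above.
    for path, size in find_outlier_sstables("/raiddrive/MDR"):
        print(f"{size / 1e9:8.1f} GB  {path.name}")
```

For scale, 184GB against about 74MB is a ratio of roughly 2,500 to 1, so a threshold of 100 times the median would flag such a file while ignoring ordinary size variation.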

 

I’m using Cassandra 0.7.4, I have two data directories (I know that’s not good practice…), and all my nodes are on Amazon EC2.

 

Any thoughts on what could be going on or how to prevent this?

 

Thanks!

Tamara

 

 


