Hello,
I am using BulkOutputFormat to load data from a .csv file into Cassandra. I am using Cassandra
1.1.3 and Hadoop 0.20.2. I have 7 Hadoop nodes: 1 namenode/jobtracker and 6 datanodes/tasktrackers.
Cassandra is installed on 4 of these 6 datanodes/tasktrackers. The issue happens when I have
more than 1 reducer. SSTables are generated on each node, but I get the following errors
in the tasktrackers' logs when they are streamed into the Cassandra cluster:
Exception in thread "Streaming to /172.16.110.79:1" java.lang.RuntimeException: java.io.EOFException
at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:628)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(Unknown Source)
at org.apache.cassandra.streaming.FileStreamTask.receiveReply(FileStreamTask.java:194)
at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:181)
at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:94)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
... 3 more
Exception in thread "Streaming to /172.16.110.92:1" java.lang.RuntimeException: java.io.EOFException
at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:628)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(Unknown Source)
at org.apache.cassandra.streaming.FileStreamTask.receiveReply(FileStreamTask.java:194)
at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:181)
at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:94)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
... 3 more ...
This is what I get in the logs of one of my Cassandra nodes:
ERROR 16:47:34,904 Sending retry message failed, closing session.
java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcher.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(Unknown Source)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
at sun.nio.ch.IOUtil.write(Unknown Source)
at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
at java.nio.channels.Channels.writeFullyImpl(Unknown Source)
at java.nio.channels.Channels.writeFully(Unknown Source)
at java.nio.channels.Channels.access$000(Unknown Source)
at java.nio.channels.Channels$1.write(Unknown Source)
at java.io.OutputStream.write(Unknown Source)
at java.nio.channels.Channels$1.write(Unknown Source)
at java.io.DataOutputStream.writeInt(Unknown Source)
at org.apache.cassandra.net.OutboundTcpConnection.write(OutboundTcpConnection.java:196)
at org.apache.cassandra.streaming.StreamInSession.sendMessage(StreamInSession.java:171)
at org.apache.cassandra.streaming.StreamInSession.retry(StreamInSession.java:160)
at org.apache.cassandra.streaming.IncomingStreamReader.retry(IncomingStreamReader.java:168)
at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:98)
at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:182)
at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:78)
Does anyone know what caused these errors?
Thank you for your help.
Regards,
Ralph