From: Michal Michalski <michalm@opera.com>
Date: Thu, 28 Mar 2013 09:56:50 +0100
To: user@cassandra.apache.org
Subject: Problem with streaming data from Hadoop: DecoratedKey(-1, )
Message-ID: <515405D2.9080202@opera.com>
We're streaming data to Cassandra directly from a MapReduce job using BulkOutputFormat. It's been working for more than a year without any problems, but yesterday one of 600 mappers failed and we got a strange-looking exception on one of the C* nodes.

IMPORTANT: It happens on one node and on one cluster only. We've loaded the same data to a test cluster and it worked.

ERROR [Thread-1340977] 2013-03-28 06:35:47,695 CassandraDaemon.java (line 133) Exception in thread Thread[Thread-1340977,5,main]
java.lang.RuntimeException: Last written key DecoratedKey(5664330507961197044404922676062547179, 302c6461696c792c32303133303332352c312c646f6d61696e2c756e6971756575736572732c633a494e2c433a6d63635f6d6e635f636172726965725f43656c6c4f6e655f4b61726e6174616b615f2842616e67616c6f7265295f494e2c643a53616d73756e675f47542d49393037302c703a612c673a3133) >= current key DecoratedKey(-1, ) writing into /cassandra/production/IndexedValues/production-IndexedValues-tmp-ib-240346-Data.db
	at org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:133)
	at org.apache.cassandra.io.sstable.SSTableWriter.appendFromStream(SSTableWriter.java:209)
	at org.apache.cassandra.streaming.IncomingStreamReader.streamIn(IncomingStreamReader.java:179)
	at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:122)
	at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:226)
	at org.apache.cassandra.net.IncomingTcpConnection.handleStream(IncomingTcpConnection.java:166)
	at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:66)

From what I can understand by looking into the C* source, it seems to me that the problem is caused by an empty (or unexpectedly exhausted?) input buffer
causing the token to be set to -1, which is improper for RandomPartitioner:

	public BigIntegerToken getToken(ByteBuffer key)
	{
		if (key.remaining() == 0)
			return MINIMUM; // Which is -1
		return new BigIntegerToken(FBUtilities.hashToBigInteger(key));
	}

However, I can't figure out the root cause of this problem. Any ideas? Of course I can't exclude a bug in my code which streams these data, but, as I said, it works when loading the same data to the test cluster (which has a different number of nodes, and thus a different token assignment, which might be a factor too).

Michał
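P.S. To illustrate why an empty key trips the beforeAppend() check, here is a minimal standalone sketch of that empty-key path. This is a simplified stand-in, not the real Cassandra classes: the MD5-based hash approximates FBUtilities.hashToBigInteger, and the class and method names here are my own.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class TokenSketch {
    static final BigInteger MINIMUM = BigInteger.valueOf(-1);

    // Simplified stand-in for RandomPartitioner.getToken():
    // MD5-based token, with -1 reserved for the empty key.
    static BigInteger getToken(ByteBuffer key) throws Exception {
        if (key.remaining() == 0)
            return MINIMUM;
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        md5.update(key.duplicate());
        return new BigInteger(md5.digest()).abs();
    }

    public static void main(String[] args) throws Exception {
        BigInteger emptyToken = getToken(ByteBuffer.allocate(0));
        BigInteger normalToken =
                getToken(ByteBuffer.wrap("somekey".getBytes(StandardCharsets.UTF_8)));
        // An empty key maps to -1, which sorts before every real (non-negative)
        // token, so any previously written key compares as ">= current key" --
        // exactly the RuntimeException above.
        System.out.println("empty key token:   " + emptyToken);   // prints -1
        System.out.println("normal key token:  " + normalToken);
        System.out.println("ordering violated: "
                + (normalToken.compareTo(emptyToken) >= 0));      // prints true
    }
}
```

So any key that arrives with zero remaining bytes will break the sorted-order invariant of the SSTable being written, regardless of what the key "should" have been.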
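P.P.S. The "last written key" in the log line is just our raw row key, hex-encoded. A quick decode sketch (using only a prefix of the logged hex, for brevity) shows it's one of our ordinary composite keys rather than corrupt bytes:

```java
public class KeyDecode {
    // Prefix of the hex-encoded "last written key" from the error log above.
    static final String HEX = "302c6461696c792c32303133303332352c312c646f6d61696e";

    // Decode a hex string to its ASCII representation.
    static String decodeHex(String hex) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < hex.length(); i += 2)
            sb.append((char) Integer.parseInt(hex.substring(i, i + 2), 16));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(decodeHex(HEX)); // prints "0,daily,20130325,1,domain"
    }
}
```

So the previously written key looks perfectly normal; it's only the key that follows it that arrives empty.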