Return-Path:
X-Original-To: apmail-cassandra-commits-archive@www.apache.org
Delivered-To: apmail-cassandra-commits-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
	by minotaur.apache.org (Postfix) with SMTP id 54E1FD517
	for ; Sun, 1 Jul 2012 00:53:48 +0000 (UTC)
Received: (qmail 45281 invoked by uid 500); 1 Jul 2012 00:53:48 -0000
Delivered-To: apmail-cassandra-commits-archive@cassandra.apache.org
Received: (qmail 45175 invoked by uid 500); 1 Jul 2012 00:53:48 -0000
Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@cassandra.apache.org
Delivered-To: mailing list commits@cassandra.apache.org
Received: (qmail 45135 invoked by uid 99); 1 Jul 2012 00:53:47 -0000
Received: from issues-vm.apache.org (HELO issues-vm) (140.211.11.160)
	by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 01 Jul 2012 00:53:47 +0000
Received: from isssues-vm.apache.org (localhost [127.0.0.1])
	by issues-vm (Postfix) with ESMTP id 879FF14283D
	for ; Sun, 1 Jul 2012 00:53:46 +0000 (UTC)
Date: Sun, 1 Jul 2012 00:53:46 +0000 (UTC)
From: "Jonathan Ellis (JIRA)"
To: commits@cassandra.apache.org
Message-ID: <2101390435.76083.1341104026557.JavaMail.jiratomcat@issues-vm>
In-Reply-To: <448685896.11850.1339580142723.JavaMail.jiratomcat@issues-vm>
Subject: [jira] [Updated] (CASSANDRA-4337) Data insertion fails because of commitlog rename failure
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394


     [ https://issues.apache.org/jira/browse/CASSANDRA-4337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-4337:
--------------------------------------

    Attachment: 4337.txt

It may be the CommitLogSegment mmap'd buffer preventing rename.  Can you test the attached patch?
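
For readers who want to probe the hypothesis above independently, here is a minimal sketch. It is not the attached 4337.txt patch (its contents are not shown in this message) and it does not use any Cassandra code. The class name MappedRenameDemo, the method renameWhileMapped, the scratch file names, and the 4096-byte mapping size are all made up for illustration. The sketch maps a file, closes the channel, and then attempts File.renameTo while the MappedByteBuffer is still reachable. Per the FileChannel.map documentation the mapping outlives the channel, and on Windows an active mapping is commonly reported to block rename and delete, whereas on POSIX systems the rename is expected to succeed.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;

// Hypothetical demo class; not part of Cassandra and not the attached patch.
final class MappedRenameDemo {

    /**
     * Creates and memory-maps {@code source}, then tries to rename it to
     * {@code target} while the mapped buffer is still reachable.
     * Returns what File.renameTo reported.
     */
    public static boolean renameWhileMapped(File source, File target) throws IOException {
        MappedByteBuffer buffer;
        try (RandomAccessFile raf = new RandomAccessFile(source, "rw");
             FileChannel channel = raf.getChannel()) {
            // Mapping 4096 bytes (an arbitrary size) in READ_WRITE mode grows the file to fit.
            buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            buffer.put(0, (byte) 1);
        }
        // The channel is closed here, but the mapping stays valid until the
        // buffer is garbage-collected, so the file may still be pinned by the OS.
        boolean renamed = source.renameTo(target);

        // Touch the buffer after the rename attempt so it is still in use at that point.
        buffer.put(1, (byte) 2);
        return renamed;
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("commitlog-demo").toFile();
        File source = new File(dir, "CommitLog-1.log");
        File target = new File(dir, "CommitLog-2.log");

        boolean renamed = renameWhileMapped(source, target);
        System.out.println("os.name=" + System.getProperty("os.name")
                + " renamed=" + renamed);
    }
}
```

Running this on each affected node and comparing the printed result with a run on a non-Windows machine would show whether a live mapping alone is enough to make the rename fail in that environment. The result is platform-dependent, so no expected output is given here.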

> Data insertion fails because of commitlog rename failure
> --------------------------------------------------------
>
>                 Key: CASSANDRA-4337
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4337
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1.1
>         Environment: - Node 1:
>     Hardware: Intel Xeon 2.83 GHz (4 cores), 24GB RAM, Dell VIRTUAL DISK SCSI 500GB
>     System: Windows Server 2008 R2 x64
>     Java version: 7 update 4 x64
> - Node 2:
>     Hardware: Intel Xeon 2.83 GHz (4 cores), 8GB RAM, Dell VIRTUAL DISK SCSI 500GB
>     System: Windows Server 2008 R2 x64
>     Java version: 7 update 4 x64
>            Reporter: Patrycjusz Matuszak
>            Assignee: Jonathan Ellis
>              Labels: commitlog
>             Fix For: 1.1.3
>
>         Attachments: 4337.txt, system-node1-stress-test.log, system-node1.log, system-node2-stress-test.log, system-node2.log
>
>
> h3. Configuration
> Cassandra server configuration:
> {noformat}heap size: 4 GB
> seed_provider:
>     - class_name: org.apache.cassandra.locator.SimpleSeedProvider
>       parameters:
>           - seeds: "xxx.xxx.xxx.10,xxx.xxx.xxx.11"
> listen_address: xxx.xxx.xxx.10
> rpc_address: 0.0.0.0
> rpc_port: 9160
> rpc_timeout_in_ms: 20000
> endpoint_snitch: PropertyFileSnitch{noformat}
> cassandra-topology.properties
> {noformat}xxx.xxx.xxx.10=datacenter1:rack1
> xxx.xxx.xxx.11=datacenter1:rack1
> default=datacenter1:rack1{noformat}
> Ring configuration:
> {noformat}Address         DC          Rack   Status  State   Load      Effective-Ownership  Token
>                                                                                         85070591730234615865843651857942052864
> xxx.xxx.xxx.10  datacenter1 rack1  Up      Normal  23,11 kB  100,00%              0
> xxx.xxx.xxx.11  datacenter1 rack1  Up      Normal  23,25 kB  100,00%              85070591730234615865843651857942052864{noformat}
> h3. Problem
> I created a keyspace and a column family using CLI commands:
> {noformat}create keyspace testks with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy' and strategy_options = {datacenter1:2};
> use testks;
> create column family testcf;{noformat}
> Then I started my Java application, which inserts 50 000 000 rows into the created column family using the Hector client. The client is connected to node 1.
> After about 30 seconds (160 000 rows inserted), the Cassandra server on node 1 throws an exception:
> {noformat}ERROR [COMMIT-LOG-ALLOCATOR] 2012-06-13 10:26:38,393 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[COMMIT-LOG-ALLOCATOR,5,main]
> java.io.IOError: java.io.IOException: Rename from c:\apache-cassandra\storage\commitlog\CommitLog-7345742389552.log to 7475933520374 failed
> 	at org.apache.cassandra.db.commitlog.CommitLogSegment.<init>(CommitLogSegment.java:127)
> 	at org.apache.cassandra.db.commitlog.CommitLogSegment.recycle(CommitLogSegment.java:204)
> 	at org.apache.cassandra.db.commitlog.CommitLogAllocator$2.run(CommitLogAllocator.java:166)
> 	at org.apache.cassandra.db.commitlog.CommitLogAllocator$1.runMayThrow(CommitLogAllocator.java:95)
> 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> 	at java.lang.Thread.run(Thread.java:722)
> Caused by: java.io.IOException: Rename from c:\apache-cassandra\storage\commitlog\CommitLog-7345742389552.log to 7475933520374 failed
> 	at org.apache.cassandra.db.commitlog.CommitLogSegment.<init>(CommitLogSegment.java:105)
> 	... 5 more{noformat}
>
> Then, a few seconds later, the Cassandra server on node 2 throws the same exception:
> {noformat}ERROR [COMMIT-LOG-ALLOCATOR] 2012-06-14 10:26:44,005 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[COMMIT-LOG-ALLOCATOR,5,main]
> java.io.IOError: java.io.IOException: Rename from c:\apache-cassandra\storage\commitlog\CommitLog-7320337904033.log to 7437675489307 failed
> 	at org.apache.cassandra.db.commitlog.CommitLogSegment.<init>(CommitLogSegment.java:127)
> 	at org.apache.cassandra.db.commitlog.CommitLogSegment.recycle(CommitLogSegment.java:204)
> 	at org.apache.cassandra.db.commitlog.CommitLogAllocator$2.run(CommitLogAllocator.java:166)
> 	at org.apache.cassandra.db.commitlog.CommitLogAllocator$1.runMayThrow(CommitLogAllocator.java:95)
> 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> 	at java.lang.Thread.run(Unknown Source)
> Caused by: java.io.IOException: Rename from c:\apache-cassandra\storage\commitlog\CommitLog-7320337904033.log to 7437675489307 failed
> 	at org.apache.cassandra.db.commitlog.CommitLogSegment.<init>(CommitLogSegment.java:105)
> 	... 5 more{noformat}
> After that, my application cannot insert any more data. Hector gets a TimedOutException from Thrift:
> {noformat}Thread-4 HConnectionManager.java 306 2012-06-14 10:26:56,034 HConnectionManager operateWithFailover WARN %Could not fullfill request on this host CassandraClient
> Thread-4 HConnectionManager.java 307 2012-06-14 10:26:56,034 HConnectionManager operateWithFailover WARN %Exception:
> me.prettyprint.hector.api.exceptions.HTimedOutException: TimedOutException()
> 	at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:35)
> 	at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:264)
> 	at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:97)
> 	at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
> 	at patrycjusz.nosqltest.db.cassandra.CassandraHectorDbAdapter.commitTransaction(CassandraDbAdapter.java:63)
> 	at patrycjusz.nosqltest.DbTest.insertData(DbTest.java:459)
> 	at patrycjusz.nosqltest.gui.InsertPanel.executeTask(NePanel.java:154)
> 	at patrycjusz.nosqltest.gui.InsertPanel$1.run(NePanel.java:141)
> 	at java.lang.Thread.run(Unknown Source)
> Caused by: TimedOutException()
> 	at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20269)
> 	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
> 	at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:922)
> 	at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:908)
> 	at me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:246)
> 	at me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:243)
> 	at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:103)
> 	at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:258)
> 	... 8 more{noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira