Return-Path:
X-Original-To: apmail-cassandra-commits-archive@www.apache.org
Delivered-To: apmail-cassandra-commits-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id AC95A10BD5 for ; Tue, 18 Feb 2014 00:49:21 +0000 (UTC)
Received: (qmail 1855 invoked by uid 500); 18 Feb 2014 00:49:20 -0000
Delivered-To: apmail-cassandra-commits-archive@cassandra.apache.org
Received: (qmail 1830 invoked by uid 500); 18 Feb 2014 00:49:19 -0000
Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@cassandra.apache.org
Delivered-To: mailing list commits@cassandra.apache.org
Received: (qmail 1818 invoked by uid 99); 18 Feb 2014 00:49:19 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 18 Feb 2014 00:49:19 +0000
Date: Tue, 18 Feb 2014 00:49:19 +0000 (UTC)
From: "sankalp kohli (JIRA)"
To: commits@cassandra.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Commented] (CASSANDRA-6716) nodetool scrub constantly fails with RuntimeException (Tried to hard link to file that does not exist)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/CASSANDRA-6716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903654#comment-13903654 ]

sankalp kohli commented on CASSANDRA-6716:
------------------------------------------

I am seeing this CASSANDRA-6285 in your logs as well.

CassandraDaemon.java (line 192) Exception in thread Thread[CompactionExecutor:25,1,main]
java.lang.RuntimeException: Last written key DecoratedKey(4020520808752189597, 31302e332e34352e3136312d6765744e6f6e4865617055736564) >= current key DecoratedKey(-2471509717181461453, 31302e332e34352e3135380b0f0000000100000004706470730c000000010c00010c00010b00010000003931302e332e34352e3135382d77696e7465726d7574655f6a6d657465) writing into /mnt/disk2/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-tmp-jb-11559-Data.db
	at org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:142)
	at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:165)
	at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
	at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
	at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
	at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
	at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:197)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:724)

> nodetool scrub constantly fails with RuntimeException (Tried to hard link to file that does not exist)
> ------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6716
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6716
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Cassandra 2.0.5 (built from source), Linux, 6 nodes, JDK 1.7
>            Reporter: Nikolai Grigoriev
>         Attachments: system.log.gz
>
>
> It seems that
> since recently I have started getting a number of exceptions like "File not found" on all Cassandra nodes. Currently I am getting an exception like this every couple of seconds on each node, for different keyspaces and CFs.
> I have tried to restart the nodes, tried to scrub them. No luck so far. It seems that scrub cannot complete on any of these nodes, at some point it fails because of the file that it can't find.
> On one of the nodes currently the "nodetool scrub" command fails instantly and consistently with this exception:
> {code}
> # /opt/cassandra/bin/nodetool scrub
> Exception in thread "main" java.lang.RuntimeException: Tried to hard link to file that does not exist /mnt/disk5/cassandra/data/mykeyspace_jmeter/test_contacts/mykeyspace_jmeter-test_contacts-jb-28049-Data.db
> 	at org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:75)
> 	at org.apache.cassandra.io.sstable.SSTableReader.createLinks(SSTableReader.java:1215)
> 	at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1826)
> 	at org.apache.cassandra.db.ColumnFamilyStore.scrub(ColumnFamilyStore.java:1122)
> 	at org.apache.cassandra.service.StorageService.scrub(StorageService.java:2159)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
> 	at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
> 	at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
> 	at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
> 	at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
> 	at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
> 	at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
> 	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
> 	at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
> 	at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
> 	at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
> 	at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
> 	at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
> 	at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
> 	at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
> 	at sun.rmi.transport.Transport$1.run(Transport.java:177)
> 	at sun.rmi.transport.Transport$1.run(Transport.java:174)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
> 	at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:553)
> 	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:808)
> 	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:667)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:724)
> {code}
> Also I have noticed that the files that are missing are often (or maybe always?) referred to in the log as follows:
> {quote}
> WARN 00:06:10,597 At level 3, SSTableReader(path='/mnt/disk5/cassandra/data/mykeyspace_jmeter/test_contacts/mykeyspace_jmeter-test_contacts-jb-26776-Data.db') [DecoratedKey(-9053060597280257896, 0010f582cddaca974d7198ae30f194ccfd0c00001000000000004b818d00000000000000010000100000000000004000000000000000000300), DecoratedKey(-8855915848970248008, 00103ce153dfeeb547fb881a51adf611f6cf0000100000000000f04f4700000000000000010000100000000000004000000000000000000500)] overlaps SSTableReader(path='/mnt/disk2/cassandra/data/mykeyspace_jmeter/test_contacts/mykeyspace_jmeter-test_contacts-jb-28022-Data.db') [DecoratedKey(-8964446543595889729, 001043214a8bdcfd46a3b8ea71da2d57bb9a0000100000000001117c0d00000000000000000000100000000000004000000000000000000100), DecoratedKey(-8848132752710859808, 0010d1f6de8039d54218bf5b1e184335df5f000010000000000062e52600000000000000010000100000000000004000000000000000000400)]. This could be caused by a bug in Cassandra 1.1.0 .. 1.1.3 or due to the fact that you have dropped sstables from another node into the data directory. Sending back to L0. If you didn't drop in sstables, and have not yet run scrub, you should do so since you may also have rows out-of-order within an sstable
> WARN [RMI TCP Connection(2)-10.3.45.158] 2014-02-18 00:06:10,597 LeveledManifest.java (line 171) At level 3, SSTableReader(path='/mnt/disk5/cassandra/data/mykeyspace_jmeter/test_contacts/mykeyspace_jmeter-test_contacts-jb-26776-Data.db') [DecoratedKey(-9053060597280257896, 0010f582cddaca974d7198ae30f194ccfd0c00001000000000004b818d00000000000000010000100000000000004000000000000000000300), DecoratedKey(-8855915848970248008, 00103ce153dfeeb547fb881a51adf611f6cf0000100000000000f04f4700000000000000010000100000000000004000000000000000000500)] overlaps SSTableReader(path='/mnt/disk2/cassandra/data/mykeyspace_jmeter/test_contacts/mykeyspace_jmeter-test_contacts-jb-28022-Data.db') [DecoratedKey(-8964446543595889729, 001043214a8bdcfd46a3b8ea71da2d57bb9a0000100000000001117c0d00000000000000000000100000000000004000000000000000000100), DecoratedKey(-8848132752710859808, 0010d1f6de8039d54218bf5b1e184335df5f000010000000000062e52600000000000000010000100000000000004000000000000000000400)]. This could be caused by a bug in Cassandra 1.1.0 .. 1.1.3 or due to the fact that you have dropped sstables from another node into the data directory. Sending back to L0. If you didn't drop in sstables, and have not yet run scrub, you should do so since you may also have rows out-of-order within an sstable
> {quote}
> I never had anything but Cassandra 2.0 on these systems. Also I have recreated my test data from scratch with 2.0.4.



--
This message was sent by Atlassian JIRA (v6.1.5#6160)