Disk utilization on the data drive is actually about 80% higher than what nodetool ring reports, and that holds across all my nodes.

 

Bryce Godfrey | Sr. Software Engineer | Azaleos Corporation | T: 206.926.1978 | M: 206.849.2477

 

From: Dan Hendry [mailto:dan.hendry.junk@gmail.com]
Sent: Thursday, November 03, 2011 11:47 AM
To: user@cassandra.apache.org
Subject: RE: Problem after upgrade to 1.0.1

 

Regarding load growth, presumably you are referring to the load as reported by JMX/nodetool. Have you actually looked at the disk utilization on the nodes themselves? Potential issue I have seen: http://www.mail-archive.com/user@cassandra.apache.org/msg18142.html
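One way to make this comparison concrete is to sum the file sizes under a node's data directory and set the result against the load figure from nodetool ring. The following is a minimal sketch, not a Cassandra tool: the helper names and the example path in the usage note are assumptions for illustration.

```python
import os


def dir_size_bytes(root):
    """Sum the sizes of all regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # File vanished between listing and stat (e.g. removed by compaction).
                pass
    return total


def percent_over_reported(on_disk_bytes, reported_bytes):
    """How much larger on-disk usage is than the reported load, in percent."""
    if reported_bytes <= 0:
        raise ValueError("reported_bytes must be positive")
    return 100.0 * (on_disk_bytes - reported_bytes) / reported_bytes
```

For example, percent_over_reported(dir_size_bytes("/var/lib/cassandra/data"), reported), with reported copied by hand from the nodetool ring output and converted to bytes. The path is an assumed data directory; substitute your own. A result of 80 means the disk holds 1.8 times the reported load.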

 

Dan

 

From: Bryce Godfrey [mailto:Bryce.Godfrey@azaleos.com]
Sent: November-03-11 14:40
To: user@cassandra.apache.org
Subject: Problem after upgrade to 1.0.1

 

I recently upgraded from 0.8.6 to 1.0.1 and everything seemed to go just fine with the rolling upgrade.  But now I’m seeing extreme load growth on one of my nodes (and the others are also growing faster than usual).  I ran cfstats against the extremely large node, which is seeing 2x the load of the others, and got the error below.  I also went into the o.a.c.db.HintedHandoffManager mbean and attempted to list pending hints to see if they were growing out of control for some reason, but that just times out eventually for any node.  I’m not sure what to do next with this issue.

 

                Column Family: HintsColumnFamily
                SSTable count: 3
                Space used (live): 12681676437
                Space used (total): 10233130272
                Number of Keys (estimate): 384
                Memtable Columns Count: 117704
                Memtable Data Size: 115107307
                Memtable Switch Count: 66
                Read Count: 0
                Read Latency: NaN ms.
                Write Count: 21203290
                Write Latency: 0.014 ms.
                Pending Tasks: 0
                Key cache capacity: 3
                Key cache size: 0
                Key cache hit rate: NaN
                Row cache: disabled
                Compacted row minimum size: 30130993
                Compacted row maximum size: 9223372036854775807

Exception in thread "main" java.lang.IllegalStateException: Unable to compute ceiling for max when histogram overflowed
        at org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.java:170)
        at org.apache.cassandra.db.DataTracker.getMeanRowSize(DataTracker.java:395)
        at org.apache.cassandra.db.ColumnFamilyStore.getMeanRowSize(ColumnFamilyStore.java:293)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
        at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
        at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:65)
        at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:216)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:666)
        at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638)
        at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1404)
        at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
        at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
        at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
        at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:600)
        at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
        at sun.rmi.transport.Transport$1.run(Transport.java:159)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
        at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
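
The Compacted row maximum size reported above, 9223372036854775807, equals 2^63 - 1 (Java's Long.MAX_VALUE), which is consistent with the exception message: a bucketed histogram that has counted a value beyond its largest bucket can no longer give a meaningful maximum or mean. The following is a simplified sketch of that behaviour, not Cassandra's EstimatedHistogram source; the class name, bucket boundaries, and method names are assumptions for illustration.

```python
LONG_MAX = 2**63 - 1  # value of Java's Long.MAX_VALUE


class OverflowingHistogram:
    """Bucketed histogram with one extra overflow bucket at the end."""

    def __init__(self, upper_bounds):
        self.upper_bounds = sorted(upper_bounds)
        # One counter per bound, plus a final counter for values above all bounds.
        self.counts = [0] * (len(self.upper_bounds) + 1)

    def add(self, value):
        for i, bound in enumerate(self.upper_bounds):
            if value <= bound:
                self.counts[i] += 1
                return
        self.counts[-1] += 1  # larger than every bound: overflow

    def overflowed(self):
        return self.counts[-1] > 0

    def max(self):
        """Upper bound of the highest non-empty bucket, or LONG_MAX on overflow."""
        if self.overflowed():
            return LONG_MAX
        for i in range(len(self.upper_bounds) - 1, -1, -1):
            if self.counts[i] > 0:
                return self.upper_bounds[i]
        return 0

    def mean(self):
        """Estimate the mean from bucket upper bounds; refuse if overflowed."""
        if self.overflowed():
            raise RuntimeError("cannot compute mean: histogram overflowed")
        n = sum(self.counts)
        if n == 0:
            return 0
        # zip stops at the last real bound, so the overflow counter is excluded.
        weighted = sum(c * b for c, b in zip(self.counts, self.upper_bounds))
        return -(-weighted // n)  # ceiling division
```

In this sketch a single oversized value is enough to turn max() into LONG_MAX and make mean() fail, which resembles what cfstats shows for HintsColumnFamily. Whether one very large hints row is the actual cause here is not established by the output alone.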

 


 
