incubator-cassandra-user mailing list archives

From Günter Ladwig <guenter.lad...@kit.edu>
Subject Re: nodetool cfstats on 1.0.0-rc1 throws an exception
Date Tue, 11 Oct 2011 21:43:42 GMT
Hi all,

I'm seeing the same problem on my 1.0.0-rc2 cluster. However, I have only three (compressed)
CFs, not 5000.

The exception does not happen for the Migrations CF, but for one of mine:

Keyspace: KeyspaceCumulus
        Read Count: 816
        Read Latency: 8.926029411764706 ms.
        Write Count: 16808336
        Write Latency: 0.03914435902518846 ms.
        Pending Tasks: 0
                Column Family: OSP
                SSTable count: 22
                Space used (live): 22319610951
                Space used (total): 22227585112
                Number of Keys (estimate): 87322624
                Memtable Columns Count: 56028
                Memtable Data Size: 54362270
                Memtable Switch Count: 154
                Read Count: 277
                Read Latency: NaN ms.
                Write Count: 10913659
                Write Latency: NaN ms.
                Pending Tasks: 0
                Key cache: disabled
                Row cache: disabled
                Compacted row minimum size: 125
                Compacted row maximum size: 9223372036854775807
Exception in thread "main" java.lang.IllegalStateException: Unable to compute ceiling for max when histogram overflowed
        at org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.java:170)
        at org.apache.cassandra.db.DataTracker.getMeanRowSize(DataTracker.java:395)
        at org.apache.cassandra.db.ColumnFamilyStore.getMeanRowSize(ColumnFamilyStore.java:275)
        [...snip…]

I also had a look at the stats using JMX. The other CFs work fine; the only problem seems
to be this one. For this CF, JMX shows 'Unavailable' for the mean row size and the same
implausible value (9223372036854775807) for the max size.
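
To illustrate why the max size shows up as 9223372036854775807 (2^63 - 1, Java's Long.MAX_VALUE) and why no mean can be computed, here is a minimal Python sketch of a bucketed histogram with an overflow bucket. It is only a simplified model built on the assumption that the histogram behaves this way; it is not the actual org.apache.cassandra.utils.EstimatedHistogram code. The class name HistogramSketch, the bucket offsets and the method names are hypothetical.

```python
import bisect

LONG_MAX = 2**63 - 1  # Java's Long.MAX_VALUE, the value cfstats printed


class HistogramSketch:
    """Bucketed histogram with one extra overflow bucket (illustrative only)."""

    def __init__(self, offsets):
        # Ascending upper bounds of the regular buckets (hypothetical values).
        self.offsets = list(offsets)
        # One counter per regular bucket plus a final overflow counter.
        self.counts = [0] * (len(self.offsets) + 1)

    def add(self, value):
        # bisect_left returns the first bucket whose upper bound is >= value;
        # a value above every bound lands in the final (overflow) slot.
        self.counts[bisect.bisect_left(self.offsets, value)] += 1

    def overflowed(self):
        return self.counts[-1] > 0

    def max(self):
        # Once the overflow bucket is used, the true maximum is unknown,
        # so the largest representable value is reported instead.
        if self.overflowed():
            return LONG_MAX
        for i in range(len(self.offsets) - 1, -1, -1):
            if self.counts[i] > 0:
                return self.offsets[i]
        return 0

    def mean(self):
        # A mean weighted by bucket upper bounds has no meaning when some
        # values have no upper bound, so this refuses to answer.
        if self.overflowed():
            raise RuntimeError(
                "Unable to compute ceiling for max when histogram overflowed")
        total = sum(self.counts)
        if total == 0:
            return 0
        weighted = sum(c * o for c, o in zip(self.counts, self.offsets))
        return -(-weighted // total)  # ceiling division


if __name__ == "__main__":
    h = HistogramSketch([100, 1000, 10000])
    h.add(125)      # fits in a regular bucket
    h.add(50000)    # larger than every bucket bound: overflow
    print(h.max())  # 9223372036854775807
```

If this model is right, a single row larger than the biggest bucket would be enough to produce both symptoms: the max is reported as Long.MAX_VALUE and asking for the mean fails, which would fit OSP having a few very wide rows.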

The cluster consists of 15 nodes. The keyspace has three CFs (SPO, OSP and POS) of which only
two contain any data (POS is empty), and uses replication factor 2. In total, there are about
2 billion columns in each CF. The data distribution is different between the two CFs. The
row sizes for SPO should be fairly evenly distributed whereas OSP will have a few very wide
rows and a large number of small rows. 

Here is the output from describe:

Keyspace: KeyspaceCumulus:
  Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
  Durable Writes: true
    Options: [replication_factor:2]
  Column Families:
    ColumnFamily: OSP
      Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
      Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
      Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
      Row cache size / save period in seconds / keys to save : 0.0/0/all
      Key cache size / save period in seconds: 0.0/0
      GC grace seconds: 0
      Compaction min/max thresholds: 4/32
      Read repair chance: 0.0
      Replicate on write: false
      Built indexes: []
      Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
      Compression Options:
        sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
    ColumnFamily: POS
      Key Validation Class: org.apache.cassandra.db.marshal.BytesType
      Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
      Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
      Row cache size / save period in seconds / keys to save : 0.0/0/all
      Key cache size / save period in seconds: 0.0/0
      GC grace seconds: 0
      Compaction min/max thresholds: 4/32
      Read repair chance: 0.0
      Replicate on write: false
      Built indexes: [POS.index_p]
      Column Metadata:
        Column Name: !o
          Validation Class: org.apache.cassandra.db.marshal.UTF8Type
        Column Name: !p
          Validation Class: org.apache.cassandra.db.marshal.UTF8Type
          Index Name: index_p
          Index Type: KEYS
      Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
      Compression Options:
        sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
    ColumnFamily: SPO
      Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
      Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
      Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
      Row cache size / save period in seconds / keys to save : 0.0/0/all
      Key cache size / save period in seconds: 0.0/0
      GC grace seconds: 0
      Compaction min/max thresholds: 4/32
      Read repair chance: 0.0
      Replicate on write: false
      Built indexes: []
      Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
      Compression Options:
        sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor

If you need additional information, let me know.

Cheers,
Günter

On 04.10.2011, at 10:20, aaron morton wrote:

> That row has a size of 819 peta bytes, so something is odd there. The error is a result
> of that value being so huge. When you ran the same script on 0.8.6, what was the max size
> of the Migrations CF?
> 
> As Jonathan says, it's unlikely anyone would have tested creating 5000 CFs. Most people
> only create a few tens of CFs at most.
> 
> Either use fewer CFs or…
> 
> * dump the Migrations CF using sstable2json to take a look around 
> * work out steps to reproduce and report it on Jira
> 
> Hope that helps. 
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 4/10/2011, at 11:30 AM, Ramesh Natarajan wrote:
> 
>> We recreated the schema using the same input file on both clusters and they are running
>> identical load.
>> 
>> Isn't the exception thrown in the system CF?
>> 
>> this line looks strange:
>> 
>> Compacted row maximum size: 9223372036854775807
>> 
>> thanks
>> Ramesh
>> 
>> On Mon, Oct 3, 2011 at 5:26 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>> Looks like you have unexpectedly large rows in your 1.0 cluster but
>> not 0.8.  I guess you could use sstable2json to manually check your
>> row sizes.
>> 
>> On Mon, Oct 3, 2011 at 5:20 PM, Ramesh Natarajan <ramesh25@gmail.com> wrote:
>> > It happens all the time on 1.0. It doesn't happen on 0.8.6.  Is there
>> > anything I can do to check?
>> > thanks
>> > Ramesh
>> >
>> > On Mon, Oct 3, 2011 at 5:15 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>> >>
>> >> My suspicion would be that it has more to do with "rare case when
>> >> running with 5000 CFs" than "1.0 regression."
>> >>
>> >> On Mon, Oct 3, 2011 at 5:00 PM, Ramesh Natarajan <ramesh25@gmail.com>
>> >> wrote:
>> >> > We have about 5000 column families, and when we run nodetool cfstats
>> >> > it throws this exception. This is running 1.0.0-rc1.
>> >> > This seems to work on 0.8.6.  Is this a bug in 1.0.0?
>> >> >
>> >> > thanks
>> >> > Ramesh
>> >> > Keyspace: system
>> >> >         Read Count: 28
>> >> >         Read Latency: 5.8675 ms.
>> >> >         Write Count: 3
>> >> >         Write Latency: 0.166 ms.
>> >> >         Pending Tasks: 0
>> >> >                 Column Family: Schema
>> >> >                 SSTable count: 4
>> >> >                 Space used (live): 4293758276
>> >> >                 Space used (total): 4293758276
>> >> >                 Number of Keys (estimate): 5376
>> >> >                 Memtable Columns Count: 0
>> >> >                 Memtable Data Size: 0
>> >> >                 Memtable Switch Count: 0
>> >> >                 Read Count: 3
>> >> >                 Read Latency: NaN ms.
>> >> >                 Write Count: 0
>> >> >                 Write Latency: NaN ms.
>> >> >                 Pending Tasks: 0
>> >> >                 Key cache capacity: 53
>> >> >                 Key cache size: 2
>> >> >                 Key cache hit rate: NaN
>> >> >                 Row cache: disabled
>> >> >                 Compacted row minimum size: 104
>> >> >                 Compacted row maximum size: 1955666
>> >> >                 Compacted row mean size: 1508515
>> >> >                 Column Family: HintsColumnFamily
>> >> >                 SSTable count: 0
>> >> >                 Space used (live): 0
>> >> >                 Space used (total): 0
>> >> >                 Number of Keys (estimate): 0
>> >> >                 Memtable Columns Count: 0
>> >> >                 Memtable Data Size: 0
>> >> >                 Memtable Switch Count: 0
>> >> >                 Read Count: 5
>> >> >                 Read Latency: NaN ms.
>> >> >                 Write Count: 0
>> >> >                 Write Latency: NaN ms.
>> >> >                 Pending Tasks: 0
>> >> >                 Key cache capacity: 1
>> >> >                 Key cache size: 0
>> >> >                 Key cache hit rate: NaN
>> >> >                 Row cache: disabled
>> >> >                 Compacted row minimum size: 0
>> >> >                 Compacted row maximum size: 0
>> >> >                 Compacted row mean size: 0
>> >> >                 Column Family: LocationInfo
>> >> >                 SSTable count: 1
>> >> >                 Space used (live): 6947
>> >> >                 Space used (total): 6947
>> >> >                 Number of Keys (estimate): 128
>> >> >                 Memtable Columns Count: 0
>> >> >                 Memtable Data Size: 0
>> >> >                 Memtable Switch Count: 2
>> >> >                 Read Count: 20
>> >> >                 Read Latency: NaN ms.
>> >> >                 Write Count: 3
>> >> >                 Write Latency: NaN ms.
>> >> >                 Pending Tasks: 0
>> >> >                 Key cache capacity: 1
>> >> >                 Key cache size: 1
>> >> >                 Key cache hit rate: NaN
>> >> >                 Row cache: disabled
>> >> >                 Compacted row minimum size: 73
>> >> >                 Compacted row maximum size: 258
>> >> >                 Compacted row mean size: 185
>> >> >                 Column Family: Migrations
>> >> >                 SSTable count: 4
>> >> >                 Space used (live): 4315909643
>> >> >                 Space used (total): 4315909643
>> >> >                 Number of Keys (estimate): 512
>> >> >                 Memtable Columns Count: 0
>> >> >                 Memtable Data Size: 0
>> >> >                 Memtable Switch Count: 0
>> >> >                 Read Count: 0
>> >> >                 Read Latency: NaN ms.
>> >> >                 Write Count: 0
>> >> >                 Write Latency: NaN ms.
>> >> >                 Pending Tasks: 0
>> >> >                 Key cache capacity: 5
>> >> >                 Key cache size: 0
>> >> >                 Key cache hit rate: NaN
>> >> >                 Row cache: disabled
>> >> >                 Compacted row minimum size: 5839589
>> >> >                 Compacted row maximum size: 9223372036854775807
>> >> > Exception in thread "main" java.lang.IllegalStateException: Unable to compute ceiling for max when histogram overflowed
>> >> >         at org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.java:170)
>> >> >         at org.apache.cassandra.db.DataTracker.getMeanRowSize(DataTracker.java:395)
>> >> >         at org.apache.cassandra.db.ColumnFamilyStore.getMeanRowSize(ColumnFamilyStore.java:275)
>> >> >         at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
>> >> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >> >         at java.lang.reflect.Method.invoke(Method.java:597)
>> >> >         at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
>> >> >         at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
>> >> >         at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
>> >> >         at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:65)
>> >> >         at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:216)
>> >> >         at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:666)
>> >> >         at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638)
>> >> >         at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1404)
>> >> >         at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
>> >> >         at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
>> >> >         at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
>> >> >         at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:600)
>> >> >         at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source)
>> >> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >> >         at java.lang.reflect.Method.invoke(Method.java:597)
>> >> >         at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
>> >> >         at sun.rmi.transport.Transport$1.run(Transport.java:159)
>> >> >         at java.security.AccessController.doPrivileged(Native Method)
>> >> >         at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
>> >> >         at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
>> >> >         at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
>> >> >         at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
>> >> >         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> >> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> >> >         at java.lang.Thread.run(Thread.java:662)
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Jonathan Ellis
>> >> Project Chair, Apache Cassandra
>> >> co-founder of DataStax, the source for professional Cassandra support
>> >> http://www.datastax.com
>> >
>> >
>> 
>> 
> 

--  
Dipl.-Inform. Günter Ladwig

Karlsruhe Institute of Technology (KIT)
Institute AIFB

Englerstraße 11 (Building 11.40, Room 250)
76131 Karlsruhe, Germany
Phone: +49 721 608-47946
Email: guenter.ladwig@kit.edu
Web: www.aifb.kit.edu

KIT – University of the State of Baden-Württemberg and National Large-scale Research Center
of the Helmholtz Association

