incubator-cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: nodetool cfstats on 1.0.0-rc1 throws an exception
Date Wed, 12 Oct 2011 14:28:47 GMT
Try scrubbing the CF ("nodetool scrub") and see if that fixes it.

If not, then at least we have a reproducible problem. :)

On Tue, Oct 11, 2011 at 4:43 PM, Günter Ladwig <guenter.ladwig@kit.edu> wrote:
> Hi all,
>
> I'm seeing the same problem on my 1.0.0-rc2 cluster. However, I do not have 5000, but just three (compressed) CFs.
>
> The exception does not happen for the Migrations CF, but for one of mine:
>
> Keyspace: KeyspaceCumulus
>        Read Count: 816
>        Read Latency: 8.926029411764706 ms.
>        Write Count: 16808336
>        Write Latency: 0.03914435902518846 ms.
>        Pending Tasks: 0
>                Column Family: OSP
>                SSTable count: 22
>                Space used (live): 22319610951
>                Space used (total): 22227585112
>                Number of Keys (estimate): 87322624
>                Memtable Columns Count: 56028
>                Memtable Data Size: 54362270
>                Memtable Switch Count: 154
>                Read Count: 277
>                Read Latency: NaN ms.
>                Write Count: 10913659
>                Write Latency: NaN ms.
>                Pending Tasks: 0
>                Key cache: disabled
>                Row cache: disabled
>                Compacted row minimum size: 125
>                Compacted row maximum size: 9223372036854775807
> Exception in thread "main" java.lang.IllegalStateException: Unable to compute ceiling for max when histogram overflowed
>        at org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.java:170)
>        at org.apache.cassandra.db.DataTracker.getMeanRowSize(DataTracker.java:395)
>        at org.apache.cassandra.db.ColumnFamilyStore.getMeanRowSize(ColumnFamilyStore.java:275)
>        [...snip…]
>
> I also had a look at the stats using JMX. The other CFs work fine; the only problem seems to be this one. In JMX it shows 'Unavailable' for the row mean size and also that ridiculous value for the max size.
>
> The cluster consists of 15 nodes. The keyspace has three CFs (SPO, OSP and POS) of which only two contain any data (POS is empty), and uses replication factor 2. In total, there are about 2 billion columns in each CF. The data distribution is different between the two CFs. The row sizes for SPO should be fairly evenly distributed whereas OSP will have a few very wide rows and a large number of small rows.
>
> Here is the output from describe:
>
> Keyspace: KeyspaceCumulus:
>  Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
>  Durable Writes: true
>    Options: [replication_factor:2]
>  Column Families:
>    ColumnFamily: OSP
>      Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
>      Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
>      Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
>      Row cache size / save period in seconds / keys to save : 0.0/0/all
>      Key cache size / save period in seconds: 0.0/0
>      GC grace seconds: 0
>      Compaction min/max thresholds: 4/32
>      Read repair chance: 0.0
>      Replicate on write: false
>      Built indexes: []
>      Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
>      Compression Options:
>        sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
>    ColumnFamily: POS
>      Key Validation Class: org.apache.cassandra.db.marshal.BytesType
>      Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
>      Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
>      Row cache size / save period in seconds / keys to save : 0.0/0/all
>      Key cache size / save period in seconds: 0.0/0
>      GC grace seconds: 0
>      Compaction min/max thresholds: 4/32
>      Read repair chance: 0.0
>      Replicate on write: false
>      Built indexes: [POS.index_p]
>      Column Metadata:
>        Column Name: !o
>          Validation Class: org.apache.cassandra.db.marshal.UTF8Type
>        Column Name: !p
>          Validation Class: org.apache.cassandra.db.marshal.UTF8Type
>          Index Name: index_p
>          Index Type: KEYS
>      Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
>      Compression Options:
>        sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
>    ColumnFamily: SPO
>      Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
>      Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
>      Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
>      Row cache size / save period in seconds / keys to save : 0.0/0/all
>      Key cache size / save period in seconds: 0.0/0
>      GC grace seconds: 0
>      Compaction min/max thresholds: 4/32
>      Read repair chance: 0.0
>      Replicate on write: false
>      Built indexes: []
>      Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
>      Compression Options:
>        sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
>
> If you need additional information, let me know.
>
> Cheers,
> Günter
>
> On 04.10.2011, at 10:20, aaron morton wrote:
>
>> That row has a size of 8192 petabytes, so something is odd there. The error is a result of that value being so huge. When you ran the same script on 0.8.6, what was the max size of the Migrations CF?
>>
>> As Jonathan says, it's unlikely anyone would have tested creating 5000 CFs. Most people only create a few tens of CFs at most.
>>
>> Either use fewer CFs or…
>>
>> * dump the Migrations CF using sstable2json to take a look around
>> * work out steps to reproduce and report it on Jira
>>
>> Hope that helps.
>>
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 4/10/2011, at 11:30 AM, Ramesh Natarajan wrote:
>>
>>> We recreated the schema using the same input file on both clusters and they are running identical load.
>>>
>>> Isn't the exception thrown in the system CF?
>>>
>>> this line looks strange:
>>>
>>> Compacted row maximum size: 9223372036854775807
>>>
>>> thanks
>>> Ramesh
>>>
>>> On Mon, Oct 3, 2011 at 5:26 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>> Looks like you have unexpectedly large rows in your 1.0 cluster but
>>> not 0.8.  I guess you could use sstable2json to manually check your
>>> row sizes.
>>>
>>> On Mon, Oct 3, 2011 at 5:20 PM, Ramesh Natarajan <ramesh25@gmail.com> wrote:
>>> > It happens all the time on 1.0. It doesn't happen on 0.8.6.  Is there
>>> > anything I can do to check?
>>> > thanks
>>> > Ramesh
>>> >
>>> > On Mon, Oct 3, 2011 at 5:15 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>> >>
>>> >> My suspicion would be that it has more to do with "rare case when
>>> >> running with 5000 CFs" than "1.0 regression."
>>> >>
>>> >> On Mon, Oct 3, 2011 at 5:00 PM, Ramesh Natarajan <ramesh25@gmail.com> wrote:
>>> >> > We have about 5000 column families, and when we run nodetool cfstats
>>> >> > it throws this exception. This is running 1.0.0-rc1.
>>> >> > This seems to work on 0.8.6.  Is this a bug in 1.0.0?
>>> >> >
>>> >> > thanks
>>> >> > Ramesh
>>> >> > Keyspace: system
>>> >> >         Read Count: 28
>>> >> >         Read Latency: 5.8675 ms.
>>> >> >         Write Count: 3
>>> >> >         Write Latency: 0.166 ms.
>>> >> >         Pending Tasks: 0
>>> >> >                 Column Family: Schema
>>> >> >                 SSTable count: 4
>>> >> >                 Space used (live): 4293758276
>>> >> >                 Space used (total): 4293758276
>>> >> >                 Number of Keys (estimate): 5376
>>> >> >                 Memtable Columns Count: 0
>>> >> >                 Memtable Data Size: 0
>>> >> >                 Memtable Switch Count: 0
>>> >> >                 Read Count: 3
>>> >> >                 Read Latency: NaN ms.
>>> >> >                 Write Count: 0
>>> >> >                 Write Latency: NaN ms.
>>> >> >                 Pending Tasks: 0
>>> >> >                 Key cache capacity: 53
>>> >> >                 Key cache size: 2
>>> >> >                 Key cache hit rate: NaN
>>> >> >                 Row cache: disabled
>>> >> >                 Compacted row minimum size: 104
>>> >> >                 Compacted row maximum size: 1955666
>>> >> >                 Compacted row mean size: 1508515
>>> >> >                 Column Family: HintsColumnFamily
>>> >> >                 SSTable count: 0
>>> >> >                 Space used (live): 0
>>> >> >                 Space used (total): 0
>>> >> >                 Number of Keys (estimate): 0
>>> >> >                 Memtable Columns Count: 0
>>> >> >                 Memtable Data Size: 0
>>> >> >                 Memtable Switch Count: 0
>>> >> >                 Read Count: 5
>>> >> >                 Read Latency: NaN ms.
>>> >> >                 Write Count: 0
>>> >> >                 Write Latency: NaN ms.
>>> >> >                 Pending Tasks: 0
>>> >> >                 Key cache capacity: 1
>>> >> >                 Key cache size: 0
>>> >> >                 Key cache hit rate: NaN
>>> >> >                 Row cache: disabled
>>> >> >                 Compacted row minimum size: 0
>>> >> >                 Compacted row maximum size: 0
>>> >> >                 Compacted row mean size: 0
>>> >> >                 Column Family: LocationInfo
>>> >> >                 SSTable count: 1
>>> >> >                 Space used (live): 6947
>>> >> >                 Space used (total): 6947
>>> >> >                 Number of Keys (estimate): 128
>>> >> >                 Memtable Columns Count: 0
>>> >> >                 Memtable Data Size: 0
>>> >> >                 Memtable Switch Count: 2
>>> >> >                 Read Count: 20
>>> >> >                 Read Latency: NaN ms.
>>> >> >                 Write Count: 3
>>> >> >                 Write Latency: NaN ms.
>>> >> >                 Pending Tasks: 0
>>> >> >                 Key cache capacity: 1
>>> >> >                 Key cache size: 1
>>> >> >                 Key cache hit rate: NaN
>>> >> >                 Row cache: disabled
>>> >> >                 Compacted row minimum size: 73
>>> >> >                 Compacted row maximum size: 258
>>> >> >                 Compacted row mean size: 185
>>> >> >                 Column Family: Migrations
>>> >> >                 SSTable count: 4
>>> >> >                 Space used (live): 4315909643
>>> >> >                 Space used (total): 4315909643
>>> >> >                 Number of Keys (estimate): 512
>>> >> >                 Memtable Columns Count: 0
>>> >> >                 Memtable Data Size: 0
>>> >> >                 Memtable Switch Count: 0
>>> >> >                 Read Count: 0
>>> >> >                 Read Latency: NaN ms.
>>> >> >                 Write Count: 0
>>> >> >                 Write Latency: NaN ms.
>>> >> >                 Pending Tasks: 0
>>> >> >                 Key cache capacity: 5
>>> >> >                 Key cache size: 0
>>> >> >                 Key cache hit rate: NaN
>>> >> >                 Row cache: disabled
>>> >> >                 Compacted row minimum size: 5839589
>>> >> >                 Compacted row maximum size: 9223372036854775807
>>> >> > Exception in thread "main" java.lang.IllegalStateException: Unable to compute ceiling for max when histogram overflowed
>>> >> >         at org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.java:170)
>>> >> >         at org.apache.cassandra.db.DataTracker.getMeanRowSize(DataTracker.java:395)
>>> >> >         at org.apache.cassandra.db.ColumnFamilyStore.getMeanRowSize(ColumnFamilyStore.java:275)
>>> >> >         at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
>>> >> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> >> >         at java.lang.reflect.Method.invoke(Method.java:597)
>>> >> >         at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
>>> >> >         at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
>>> >> >         at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
>>> >> >         at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:65)
>>> >> >         at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:216)
>>> >> >         at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:666)
>>> >> >         at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638)
>>> >> >         at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1404)
>>> >> >         at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
>>> >> >         at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
>>> >> >         at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
>>> >> >         at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:600)
>>> >> >         at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source)
>>> >> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> >> >         at java.lang.reflect.Method.invoke(Method.java:597)
>>> >> >         at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
>>> >> >         at sun.rmi.transport.Transport$1.run(Transport.java:159)
>>> >> >         at java.security.AccessController.doPrivileged(Native Method)
>>> >> >         at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
>>> >> >         at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
>>> >> >         at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
>>> >> >         at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
>>> >> >         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>> >> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>> >> >         at java.lang.Thread.run(Thread.java:662)
>>> >> >
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Jonathan Ellis
>>> >> Project Chair, Apache Cassandra
>>> >> co-founder of DataStax, the source for professional Cassandra support
>>> >> http://www.datastax.com
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of DataStax, the source for professional Cassandra support
>>> http://www.datastax.com
>>>
>>
>
> --
> Dipl.-Inform. Günter Ladwig
>
> Karlsruhe Institute of Technology (KIT)
> Institute AIFB
>
> Englerstraße 11 (Building 11.40, Room 250)
> 76131 Karlsruhe, Germany
> Phone: +49 721 608-47946
> Email: guenter.ladwig@kit.edu
> Web: www.aifb.kit.edu
>
> KIT – University of the State of Baden-Württemberg and National Large-scale Research Center of the Helmholtz Association
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
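
For context on the numbers in this thread: 9223372036854775807 is Long.MAX_VALUE (2^63 - 1), and the message "Unable to compute ceiling for max when histogram overflowed" suggests that the row-size histogram keeps an overflow bucket for values beyond its largest boundary. The Java sketch below shows how a histogram built that way could report Long.MAX_VALUE as its maximum and refuse to compute a mean once anything lands in the overflow bucket. It is a simplified illustration under that assumption, not the Cassandra EstimatedHistogram source; the class name OverflowHistogramSketch, the bucket boundaries and the sample values are made up for the example.

```java
import java.util.Arrays;

/** Simplified illustration only; not the actual Cassandra EstimatedHistogram code. */
final class OverflowHistogramSketch {
    // Inclusive upper bounds of the regular buckets, in ascending order.
    private final long[] bucketOffsets;
    // Counts per bucket; the extra last slot is the overflow bucket.
    private final long[] buckets;

    OverflowHistogramSketch(long[] bucketOffsets) {
        this.bucketOffsets = bucketOffsets.clone();
        this.buckets = new long[bucketOffsets.length + 1];
    }

    void add(long value) {
        int index = Arrays.binarySearch(bucketOffsets, value);
        if (index < 0) {
            // Not an exact boundary: use the insertion point, i.e. the first
            // bucket whose bound is larger than the value. Values beyond the
            // last bound map to the overflow slot.
            index = -index - 1;
        }
        buckets[index]++;
    }

    boolean isOverflowed() {
        return buckets[buckets.length - 1] > 0;
    }

    /** Upper bound of the highest non-empty bucket, or Long.MAX_VALUE on overflow. */
    long max() {
        if (isOverflowed()) {
            return Long.MAX_VALUE;
        }
        for (int i = bucketOffsets.length - 1; i >= 0; i--) {
            if (buckets[i] > 0) {
                return bucketOffsets[i];
            }
        }
        return 0;
    }

    /** Mean estimated from bucket upper bounds, rounded up; undefined on overflow. */
    long mean() {
        if (isOverflowed()) {
            throw new IllegalStateException(
                "Unable to compute ceiling for max when histogram overflowed");
        }
        long count = 0;
        long sum = 0;
        for (int i = 0; i < bucketOffsets.length; i++) {
            count += buckets[i];
            sum += buckets[i] * bucketOffsets[i];
        }
        return count == 0 ? 0 : (long) Math.ceil((double) sum / count);
    }

    public static void main(String[] args) {
        OverflowHistogramSketch h = new OverflowHistogramSketch(new long[] {10, 100, 1000});
        h.add(5);
        h.add(80);
        System.out.println("max=" + h.max() + " mean=" + h.mean());

        h.add(5000); // larger than the last bucket boundary, so it overflows
        System.out.println("max=" + h.max());
        try {
            h.mean();
        } catch (IllegalStateException e) {
            System.out.println("mean failed: " + e.getMessage());
        }
    }
}
```

If the real histogram behaves like this sketch, a single row larger than the last bucket boundary would be enough to produce both symptoms reported above: the implausible maximum and the exception from the mean calculation.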
