cassandra-user mailing list archives

From Phil Stanhope <pstanh...@wimba.com>
Subject Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
Date Tue, 15 Jun 2010 20:54:36 GMT
How are you doing your inserts?

I draw a clear line between 1) bootstrapping a cluster with data and 2) simulating expected/projected
read/write behavior.

If you are bootstrapping, I would look into the batch_mutate API. It lets you dramatically
improve write performance.

If you are read/write testing on a populated cluster, insert and batch_insert (for super columns)
are the way to go.
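To make the batch_mutate suggestion concrete, here is a minimal sketch of the shape of the
mutation map that the Thrift batch_mutate call takes in Cassandra 0.6: a map of row key to
column family name to a list of mutations. Plain dicts and lists stand in for the Thrift-generated
Mutation/ColumnOrSuperColumn classes here, and "Standard1" (the column family from the log below)
is used for illustration — adjust for your own schema.

```python
def build_mutation_map(rows, column_family="Standard1"):
    """Build a batch_mutate-style mutation map from a dict of
    row_key -> {column_name: value}.

    In the real Thrift API each list entry is a Mutation wrapping a
    ColumnOrSuperColumn; plain dicts stand in for those types here.
    """
    mutation_map = {}
    for row_key, columns in rows.items():
        mutations = []
        for name, value in columns.items():
            mutations.append({"column": {"name": name, "value": value}})
        # One row key -> one or more column families -> list of mutations
        mutation_map[row_key] = {column_family: mutations}
    return mutation_map

# All mutations for many rows go to the server in a single round trip,
# which is where the bulk-load speedup comes from.
mm = build_mutation_map({"user1": {"age": "30", "city": "Boston"},
                         "user2": {"age": "25"}})
```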

As Ben has pointed out to me in numerous threads ... think carefully about your replication
factor. Do you want the data on all nodes, or replicated just enough that you can recover from
a failure? Do you want consistency at write time, or eventually?
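As a rough sketch of the trade-off above: the number of replicas that must acknowledge an
operation depends on the consistency level, with QUORUM meaning floor(RF/2) + 1. The function
names below are illustrative, not part of any Cassandra API.

```python
def quorum(replication_factor):
    # A quorum is a strict majority of the replicas.
    return replication_factor // 2 + 1

def replicas_required(consistency_level, rf):
    """How many replicas must respond before the operation succeeds."""
    levels = {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}
    return levels[consistency_level]

# With RF=3: ONE waits for a single ack, QUORUM for 2, ALL for all 3.
# Waiting on ALL (as in the thread below) maximizes the work done per
# write, which is why it amplifies resource pressure under load.
needed = replicas_required("ALL", 3)
```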

Cassandra has a bunch of knobs that you can turn ... but that flexibility requires that you
think about your expected usage patterns and operational policies.

-phil

On Jun 15, 2010, at 4:40 PM, Julie wrote:

> Benjamin Black <b <at> b3k.us> writes:
> 
>> 
>> You are likely exhausting your heap space (probably still at the very
>> small 1G default?), and maximizing the amount of resource consumption
>> by using CL.ALL.  Why are you using ALL?
>> 
>> On Tue, Jun 15, 2010 at 11:58 AM, Julie <julie.sugar <at> nextcentury.com> wrote:
> ...
>>> Coinciding with my write timeouts, all 10 of my cassandra servers are getting
>>> the following exception written to system.log:
>>> 
>>>  INFO [FLUSH-WRITER-POOL:1] 2010-06-15 13:13:54,411 Memtable.java (line 162)
>>> Completed flushing /var/lib/cassandra/data/Keyspace1/Standard1-359-Data.db
>>> ERROR [MESSAGE-STREAMING-POOL:1] 2010-06-15 13:13:59,145
>>> DebuggableThreadPoolExecutor.java (line 101) Error in ThreadPoolExecutor
>>> java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
>>>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>         at java.lang.Thread.run(Thread.java:619)
>>> Caused by: java.io.IOException: Value too large for defined data type
>>>         at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
>>>         at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
>>>         at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
>>>         at org.apache.cassandra.net.FileStreamTask.stream(FileStreamTask.java:95)
>>>         at org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:63)
>>>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>>>         ... 3 more
> ...
> 
> 
> Thanks for your reply.  Yes, my heap space is 1G.  My VMs have only 1.7G of 
> memory, so I hesitate to use more.  I am using ALL because I was crashing 
> Cassandra with a heap space error when I used ZERO (see my posting from a few 
> days ago), so it was recommended that I use ALL instead.  I also tried ONE but 
> got even more write timeouts, so I thought it would be safer to wait for 
> all replicas to be written before trying to write more rows. 
> 
> Thank you for your help.
