incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Increasing thrift_framed_transport_size_in_mb
Date Sun, 25 Sep 2011 09:02:44 GMT
Some discussion of large data here http://wiki.apache.org/cassandra/LargeDataSetConsiderations

When creating large rows you also need to be aware of in_memory_compaction_limit_in_mb (see
the yaml), and that all columns for a row are stored on the same node. So if you store one
file in a single row you may not get the best load distribution.

I've heard it mentioned before that 10MB is a reasonable maximum for a row if you have no natural partitions.


That said, CFS in Brisk stored each block in its own row and used columns for the sub-blocks. The
default settings for CFS are

<!-- 64 MB default -->
<property>
  <name>fs.local.block.size</name>
  <value>67108864</value>
</property>

<!-- 2 MB SubBlock Size -->
<property>
  <name>fs.local.subblock.size</name>
  <value>2097152</value> 
</property>
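A minimal sketch of that block/sub-block layout (my own illustration, not Brisk's actual code): each 64 MB block maps to one row, and each 2 MB sub-block becomes a column in that row, so blocks of one file land on different nodes. The function name and tuple layout here are assumptions for illustration.

```python
# Hypothetical sketch of a CFS-style chunking scheme:
# one row per block, one column per sub-block.

BLOCK_SIZE = 64 * 1024 * 1024     # fs.local.block.size (67108864)
SUBBLOCK_SIZE = 2 * 1024 * 1024   # fs.local.subblock.size (2097152)

def chunk_file(data, block_size=BLOCK_SIZE, subblock_size=SUBBLOCK_SIZE):
    """Yield (block_number, subblock_number, subblock_bytes) triples.

    block_number would go into the row key (so blocks spread across
    nodes), and subblock_number would be the column name within that row.
    """
    for block_no, block_start in enumerate(range(0, len(data), block_size)):
        block = data[block_start:block_start + block_size]
        for sub_no, sub_start in enumerate(range(0, len(block), subblock_size)):
            yield (block_no, sub_no, block[sub_start:sub_start + subblock_size])
```

With the default sizes, a multi-gigabyte file produces many 64 MB rows of 32 columns each, keeping every individual row well under the framed-transport and compaction limits discussed above.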

Hope that helps. 

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 24/09/2011, at 9:27 PM, Radim Kolar wrote:

> On 24.9.2011 0:05, Jonathan Ellis wrote:
>> Really large messages are not encouraged because they will fragment
>> your heap quickly.  Other than that, no.
> what is the recommended chunk size for storing multi-gigabyte files in cassandra? Is 64MB okay, or is it too large?

