cassandra-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: Endless minor compactions after heavy inserts
Date Sun, 03 Apr 2011 18:04:52 GMT
On Sun, Apr 3, 2011 at 1:46 PM, Sheng Chen <chensheng2010@gmail.com> wrote:
> I think if I can keep each sstable file at a proper size, the hot
> data/index files may be able to fit into memory, at least on some occasions.
>
> In my use case, I want to use Cassandra to store a large amount of log
> data.
> There will be multiple nodes, and each node has 10 x 2 TB disks to hold as
> much data as possible, ideally 20 TB (about 100 billion rows) per node.
> Read operations will be much less frequent than writes. A read latency
> within 1 second is acceptable.
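> (That works out to an average of 20e12 bytes / 100e9 rows = ~200 bytes
> per row.)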
>
> Is it possible? Do you have advice on this design?
> Thank you.
>
> Sheng
>
>
>
> 2011/4/3 aaron morton <aaron@thelastpickle.com>
>>
>> With only one data file your reads would use the least amount of IO to
>> find the data.
>> Most people have multiple nodes and probably fewer disks, so each node may
>> have a TB or two of data. How much capacity do your 10 disks give? Will you
>> be running multiple nodes in production?
>> Aaron
>>
>>
>> On 2 Apr 2011, at 12:45, Sheng Chen wrote:
>>
>> Thank you very much.
>> The major compaction will merge everything into one big file, which would
>> be very large.
>> Is there any way to control the number or size of files created by major
>> compaction?
>> Or, is there a recommended number or size of files for cassandra to
>> handle?
>> Thanks. I see the trigger of my minor compactions is OperationsInMillions.
>> It is the total number of operations, which I had thought was per second.
>> Cheers,
>> Sheng
>>
>> 2011/4/1 aaron morton <aaron@thelastpickle.com>
>>>
>>> If you are doing some sort of bulk load you can disable minor compactions
>>> by setting the min_compaction_threshold and max_compaction_threshold to 0.
>>> Then, once your insert is complete, run a major compaction via nodetool
>>> before turning minor compactions back on.
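>>> As a rough sketch of the sequence (the exact nodetool arguments differ a
>>> little between versions, and Keyspace1/Standard1 are just the stress
>>> tool's defaults; check nodetool's usage output for your version):
>>>
>>>   nodetool -h localhost setcompactionthreshold Keyspace1 Standard1 0 0
>>>   # (run the bulk load)
>>>   nodetool -h localhost compact Keyspace1
>>>   nodetool -h localhost setcompactionthreshold Keyspace1 Standard1 4 32
>>>
>>> where 4 and 32 are the default min/max thresholds.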
>>>
>>> You can also reduce the compaction thread priority; see
>>> compaction_thread_priority in the yaml file.
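>>> For example, in cassandra.yaml (1 is the lowest Java thread priority):
>>>
>>>   compaction_thread_priority: 1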
>>>
>>> The memtable will be flushed when either the MB or the ops threshold is
>>> triggered. If you are seeing a lot of memtables smaller than the MB
>>> threshold then the ops threshold has probably been triggered. Look for a log
>>> message at INFO level starting with "Enqueuing flush of Memtable" that will
>>> tell you how many bytes and ops the memtable had when it was flushed. Try
>>> increasing the ops threshold and see what happens.
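>>> The flush line looks something like this (numbers here are made up):
>>>
>>>   INFO ... Enqueuing flush of Memtable-Standard1@123456(52428800 bytes, 7000000 operations)
>>>
>>> and the per-CF thresholds can be raised from cassandra-cli with something
>>> along these lines (a sketch; attribute names vary a little by version):
>>>
>>>   update column family Standard1 with memtable_throughput=1499 and memtable_operations=14;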
>>>
>>> Your change to the compaction threshold may not have had an effect
>>> because the compaction process was already running.
>>>
>>> AFAIK the best way to get the most out of your 10 disks will be to use a
>>> dedicated mirror for the commit log and a stripe set for the data.
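>>> In cassandra.yaml that layout would look something like this (the mount
>>> points are just examples):
>>>
>>>   commitlog_directory: /mnt/mirror0/cassandra/commitlog
>>>   data_file_directories:
>>>       - /mnt/stripe0/cassandra/data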
>>>
>>> Hope that helps.
>>> Aaron
>>>
>>> On 1 Apr 2011, at 14:52, Sheng Chen wrote:
>>>
>>> > I've got a single node of Cassandra 0.7.4, and I used the java stress
>>> > tool to insert about 100 million records.
>>> > The inserts took about 6 hours (45k inserts/sec), but the subsequent
>>> > minor compactions have lasted for 2 days and the pending compaction
>>> > jobs are still increasing.
>>> >
>>> > From jconsole I can read MemtableThroughputInMB=1499 and
>>> > MemtableOperationsInMillions=7.0.
>>> > But in my data directory I have hundreds of 438MB data files, which
>>> > should be the cause of the minor compactions.
>>> >
>>> > I tried to set the compaction threshold by nodetool, but it didn't seem
>>> > to take effect (no change in pending compaction tasks).
>>> > After restarting the node, my setting was lost.
>>> >
>>> > I want to distribute the read load across my disks (10 disks, xfs on
>>> > LVM), so I don't want to do a major compaction.
>>> > So, what can I do to keep the sstable files at a reasonable size, or to
>>> > make the minor compactions faster?
>>> >
>>> > Thank you in advance.
>>> > Sheng
>>> >
>>>
>>
>>
>
>

Consider the implications of
http://wiki.apache.org/cassandra/LargeDataSetConsiderations
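(Rough arithmetic on why that page matters here: in 0.7 the bloom filter
for every row key lives in the JVM heap, at roughly a byte or two per key.
At 100 billion rows per node that is on the order of 100+ GB of bloom
filter alone, far beyond any practical heap size, before even counting the
in-memory index samples.)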
