incubator-cassandra-user mailing list archives

From Franc Carter <franc.car...@sirca.org.au>
Subject Re: Large number of files for Leveled Compaction
Date Tue, 18 Jun 2013 12:57:03 GMT
On Mon, Jun 17, 2013 at 3:37 PM, Franc Carter <franc.carter@sirca.org.au> wrote:

> On Mon, Jun 17, 2013 at 3:28 PM, Wei Zhu <wz1975@yahoo.com> wrote:
>
>> The default value of 5MB is way too small in practice. Too many files in one
>> directory is not a good thing. It's not clear what a good number would be.
>> I have heard of people using 50MB, 75MB, even 100MB. Do your own tests to
>> find the "right" number.
>>
>
> Interesting - 50MB is the low end of what people are using - 5MB is a lot
> lower. I'll try a 50MB set
>

Oops, forgot to ask - is there a way to get Cassandra to rebuild the
sstables as bigger once I have updated the column family definition ?

thanks
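
For a rough sense of why the sstable size matters so much for the file count, here is a back-of-envelope sketch. It assumes the number of sstables is roughly the data size divided by sstable_size_in_mb, and that each sstable is stored as several component files. The 100,000 MB data size and the 7 files per sstable are hypothetical values for illustration, not measurements from this cluster, and the function name is made up.

```python
import math


def estimate_sstable_files(data_size_mb, sstable_size_mb, files_per_sstable=7):
    """Roughly estimate (sstable count, file count) for one column family on one node.

    files_per_sstable is an assumption (data, index, filter and similar
    component files); check what your Cassandra version actually writes.
    """
    if data_size_mb < 0 or sstable_size_mb <= 0 or files_per_sstable <= 0:
        raise ValueError("sizes and counts must be positive")
    # With leveled compaction every sstable is about sstable_size_mb,
    # so the count is simply the data size divided by that size.
    sstables = math.ceil(data_size_mb / sstable_size_mb)
    return sstables, sstables * files_per_sstable


if __name__ == "__main__":
    hypothetical_data_mb = 100_000  # assumed data size, not from the thread
    for size_mb in (5, 50, 100):
        sstables, files = estimate_sstable_files(hypothetical_data_mb, size_mb)
        print(f"sstable_size_in_mb={size_mb}: ~{sstables} sstables, ~{files} files")
```

Under these assumptions, going from 5MB to 50MB cuts the file count by a factor of ten, which is the effect Wei describes.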


>
> cheers
>
>
>> -Wei
>>
>> ------------------------------
>> *From: *"Franc Carter" <franc.carter@sirca.org.au>
>> *To: *user@cassandra.apache.org
>> *Sent: *Sunday, June 16, 2013 10:15:22 PM
>> *Subject: *Re: Large number of files for Leveled Compaction
>>
>>
>>
>>
>> On Mon, Jun 17, 2013 at 2:59 PM, Manoj Mainali <mainalimanoj@gmail.com> wrote:
>>
>>> Not in the case of LeveledCompaction. Only SizeTieredCompaction merges
>>> smaller sstables into large ones. With the LeveledCompaction, the sstables
>>> are always of fixed size but they are grouped into different levels.
>>>
>>> You can refer to this page
>>> http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra on
>>> details of how LeveledCompaction works.
>>>
>>>
>> Yes, but it seems I've misinterpreted that page ;-(
>>
>> I took this paragraph
>>
>>> In figure 3, new sstables are added to the first level, L0, and
>>> immediately compacted with the sstables in L1 (blue). When L1 fills up,
>>> extra sstables are promoted to L2 (violet). Subsequent sstables generated
>>> in L1 will be compacted with the sstables in L2 with which they overlap. As
>>> more data is added, leveled compaction results in a situation like the one
>>> shown in figure 4.
>>>
>>
>> to mean that once a level fills up it gets compacted into a higher level
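
A minimal sketch of the corrected picture, where sstables keep a fixed size and are only grouped into levels. It assumes L1 holds up to 10 sstables and each higher level holds 10 times as many as the one below, which is how the linked blog post describes it as far as I can tell, not something stated in this thread. The function name and the sstable counts are hypothetical, and filling levels strictly from the bottom up is a simplification.

```python
def sstables_per_level(total_sstables, l1_capacity=10, fanout=10):
    """Distribute fixed-size sstables over levels L1, L2, ... filling each in order.

    l1_capacity and fanout are assumptions taken from the blog post's
    description of leveled compaction, not from Cassandra's source.
    """
    if total_sstables < 0 or l1_capacity <= 0 or fanout <= 1:
        raise ValueError("invalid level parameters")
    per_level = []
    capacity = l1_capacity
    remaining = total_sstables
    while remaining > 0:
        used = min(remaining, capacity)
        per_level.append(used)
        remaining -= used
        # The next level can hold `fanout` times as many sstables,
        # but every sstable in it is still the same fixed size.
        capacity *= fanout
    return per_level


if __name__ == "__main__":
    for count in (2_000, 20_000):  # hypothetical sstable counts
        print(count, "sstables ->", sstables_per_level(count))
```

The point is that promotion to a higher level changes which level an sstable belongs to, not how big the files are, so the total number of sstables does not shrink.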
>>
>> cheers
>>
>>
>>> Cheers
>>> Manoj
>>>
>>>
>>> On Mon, Jun 17, 2013 at 1:54 PM, Franc Carter <franc.carter@sirca.org.au
>>> > wrote:
>>>
>>>> On Mon, Jun 17, 2013 at 2:47 PM, Manoj Mainali <mainalimanoj@gmail.com> wrote:
>>>>
>>>>> With LeveledCompaction, each sstable size is fixed and is defined by
>>>>> sstable_size_in_mb in the compaction configuration of the CF definition;
>>>>> the default value is 5MB. In your case, you may not have defined your own
>>>>> value, which is why each of your sstables is 5MB. And if your dataset is
>>>>> huge, you will see a large number of sstables.
>>>>>
>>>>
>>>>
>>>> Ok, seems like I do have (at least) an incomplete understanding. I
>>>> realise that the minimum size is 5MB, but I thought compaction would merge
>>>> these into a smaller number of larger sstables ?
>>>>
>>>> thanks
>>>>
>>>>
>>>>> Cheers
>>>>>
>>>>> Manoj
>>>>>
>>>>>
>>>>> On Fri, Jun 7, 2013 at 1:44 PM, Franc Carter <
>>>>> franc.carter@sirca.org.au> wrote:
>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> We are trialling Cassandra-1.2(.4) with Leveled compaction as it
>>>>>> looks like it may be a win for us.
>>>>>>
>>>>>> The first step of testing was to push a fairly large slab of data
>>>>>> into the Column Family - we did this much faster (> x100) than we
>>>>>> would in a production environment. This has left the Column Family
>>>>>> with about 140,000 files in the Column Family directory which seems
>>>>>> way too high. On two of the nodes the CompactionStats show 2
>>>>>> outstanding tasks and on a third node there are over 13,000
>>>>>> outstanding tasks. However from looking at the log activity it looks
>>>>>> like compaction has finished on all nodes.
>>>>>>
>>>>>> Is this number of files expected/normal ?
>>>>>>
>>>>>> cheers
>>>>>>
>>>>>> --
>>>>>>
>>>>>> *Franc Carter* | Systems architect | Sirca Ltd
>>>>>>  <marc.zianideferranti@sirca.org.au>
>>>>>>
>>>>>> franc.carter@sirca.org.au | www.sirca.org.au
>>>>>>
>>>>>> Tel: +61 2 8355 2514
>>>>>>
>>>>>> Level 4, 55 Harrington St, The Rocks NSW 2000
>>>>>>
>>>>>> PO Box H58, Australia Square, Sydney NSW 1215
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>>
>>
>>
>
>
>
>
>


-- 

*Franc Carter* | Systems architect | Sirca Ltd
 <marc.zianideferranti@sirca.org.au>

franc.carter@sirca.org.au | www.sirca.org.au

Tel: +61 2 8355 2514

Level 4, 55 Harrington St, The Rocks NSW 2000

PO Box H58, Australia Square, Sydney NSW 1215
