incubator-cassandra-user mailing list archives

From Franc Carter <franc.car...@sirca.org.au>
Subject Re: Large number of files for Leveled Compaction
Date Mon, 17 Jun 2013 05:37:43 GMT
On Mon, Jun 17, 2013 at 3:28 PM, Wei Zhu <wz1975@yahoo.com> wrote:

> The default value of 5MB is way too small in practice. Too many files in one
> directory is not a good thing. It's not clear what a good number would be.
> I have heard of people using 50MB, 75MB, even 100MB. Do your own tests to
> find the "right" number.
>

Interesting. 50MB is the low end of what people are using, and the 5MB default
is a lot lower. I'll try a 50MB setting.
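
To get a feel for how much the sstable size changes the file count, here is a rough sketch. The data size (100 GB per node), the figure of 7 component files per sstable, and the keyspace/table names are all assumptions for illustration, not measurements from this cluster. The ALTER TABLE string assumes the column family is managed through CQL3.

```python
import math


def estimate_sstable_files(data_size_mb, sstable_size_mb, components_per_sstable=7):
    """Rough count of sstables and on-disk files for a leveled column family.

    components_per_sstable is an assumed value (Data, Index, Filter, ...);
    check a real data directory for the actual number on your version.
    """
    sstables = math.ceil(data_size_mb / sstable_size_mb)
    return sstables, sstables * components_per_sstable


def alter_sstable_size_cql(keyspace, table, sstable_size_mb):
    """Build a CQL3 statement that sets sstable_size_in_mb (names are hypothetical)."""
    return (
        f"ALTER TABLE {keyspace}.{table} WITH compaction = "
        f"{{'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb': {sstable_size_mb}}};"
    )


if __name__ == "__main__":
    data_size_mb = 100 * 1024  # hypothetical 100 GB of data on one node
    for size in (5, 50, 100):
        sstables, files = estimate_sstable_files(data_size_mb, size)
        print(f"{size:>3} MB sstables: ~{sstables} sstables, ~{files} files")
    print(alter_sstable_size_cql("my_keyspace", "my_cf", 50))
```

Under these assumptions, moving from 5MB to 50MB cuts the file count by a factor of ten. Existing sstables keep their old size until they are compacted again.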

cheers


> -Wei
>
> ------------------------------
> *From: *"Franc Carter" <franc.carter@sirca.org.au>
> *To: *user@cassandra.apache.org
> *Sent: *Sunday, June 16, 2013 10:15:22 PM
> *Subject: *Re: Large number of files for Leveled Compaction
>
>
>
>
> On Mon, Jun 17, 2013 at 2:59 PM, Manoj Mainali <mainalimanoj@gmail.com> wrote:
>
>> Not in the case of LeveledCompaction. Only SizeTieredCompaction merges
>> smaller sstables into large ones. With the LeveledCompaction, the sstables
>> are always of fixed size but they are grouped into different levels.
>>
>> You can refer to this page for details of how LeveledCompaction works:
>> http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra
>>
>>
> Yes, but it seems I've misinterpreted that page ;-(
>
> I took this paragraph
>
>> In figure 3, new sstables are added to the first level, L0, and
>> immediately compacted with the sstables in L1 (blue). When L1 fills up,
>> extra sstables are promoted to L2 (violet). Subsequent sstables generated
>> in L1 will be compacted with the sstables in L2 with which they overlap. As
>> more data is added, leveled compaction results in a situation like the one
>> shown in figure 4.
>>
>
> to mean that once a level fills up it gets compacted into a higher level
>
> cheers
>
>
>> Cheers
>> Manoj
>>
>>
>> On Mon, Jun 17, 2013 at 1:54 PM, Franc Carter <franc.carter@sirca.org.au> wrote:
>>
>>> On Mon, Jun 17, 2013 at 2:47 PM, Manoj Mainali <mainalimanoj@gmail.com> wrote:
>>>
>>>> With LeveledCompaction, each sstable size is fixed and is defined by
>>>> sstable_size_in_mb in the compaction configuration of the CF definition; the
>>>> default value is 5MB. In your case, you may not have defined your own value,
>>>> which is why each of your sstables is 5MB. And if your dataset is huge, you
>>>> will see a large sstable count.
>>>>
>>>
>>>
>>> Ok, seems like I do have (at least) an incomplete understanding. I
>>> realise that the minimum size is 5MB, but I thought compaction would merge
>>> these into a smaller number of larger sstables ?
>>>
>>> thanks
>>>
>>>
>>>> Cheers
>>>>
>>>> Manoj
>>>>
>>>>
>>>> On Fri, Jun 7, 2013 at 1:44 PM, Franc Carter <franc.carter@sirca.org.au> wrote:
>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> We are trialling Cassandra-1.2(.4) with Leveled compaction as it looks
>>>>> like it may be a win for us.
>>>>>
>>>>> The first step of testing was to push a fairly large slab of data into
>>>>> the Column Family - we did this much faster (> x100) than we would in a
>>>>> production environment. This has left the Column Family with about 140,000
>>>>> files in the Column Family directory which seems way too high. On two of
>>>>> the nodes the CompactionStats show 2 outstanding tasks and on a third node
>>>>> there are over 13,000 outstanding tasks. However from looking at the log
>>>>> activity it looks like compaction has finished on all nodes.
>>>>>
>>>>> Is this number of files expected/normal ?
>>>>>
>>>>> cheers
>>>>>
>>>>> --
>>>>>
>>>>> *Franc Carter* | Systems architect | Sirca Ltd
>>>>>  <marc.zianideferranti@sirca.org.au>
>>>>>
>>>>> franc.carter@sirca.org.au | www.sirca.org.au
>>>>>
>>>>> Tel: +61 2 8355 2514
>>>>>
>>>>> Level 4, 55 Harrington St, The Rocks NSW 2000
>>>>>
>>>>> PO Box H58, Australia Square, Sydney NSW 1215
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>
>

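For anyone else who read the blog post the way I did, here is a small sketch of how fixed-size sstables end up grouped into levels. It assumes L1 holds 10 sstables and each level is ten times the size of the previous one, as the linked DataStax post describes. It ignores L0 and simply fills levels from the bottom up, which is a simplification. The function name and the sstable count are made up for illustration.

```python
def levels_needed(total_sstables, l1_capacity=10, fanout=10):
    """Return the number of sstables per level (L1, L2, ...) when levels
    are filled in order. Simplified model: real placement depends on
    compaction progress and overlapping key ranges.
    """
    layout = []
    capacity = l1_capacity
    remaining = total_sstables
    while remaining > 0:
        in_level = min(remaining, capacity)
        layout.append(in_level)
        remaining -= in_level
        capacity *= fanout  # each level holds ten times more than the last
    return layout


if __name__ == "__main__":
    # Hypothetical node holding 20,480 sstables of fixed size
    for level, count in enumerate(levels_needed(20480), start=1):
        print(f"L{level}: {count} sstables")
```

The point is that promotion moves sstables between levels without merging them into bigger files, so the total count only drops if the configured sstable size goes up.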

-- 

*Franc Carter* | Systems architect | Sirca Ltd
 <marc.zianideferranti@sirca.org.au>

franc.carter@sirca.org.au | www.sirca.org.au

Tel: +61 2 8355 2514

Level 4, 55 Harrington St, The Rocks NSW 2000

PO Box H58, Australia Square, Sydney NSW 1215
