incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Number of keyspaces
Date Wed, 23 May 2012 10:09:29 GMT
> We were thinking of doing a major compaction after each year is 'closed off'. 
Not a terrible idea. Years tend to happen annually, so their growth pattern is well understood.


> This would mean that compactions for the current year were dealing with a smaller amount of data and hence be faster and have less impact on a day-to-day basis.
Older data is compacted into higher tiers / generations, so it will not be included when compacting
new data (background: http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra).
That said, there is a chance that at some point the big older files get compacted, i.e.
if you get (by default) 4 x 100GB files they will get compacted into 1.
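
To make the tiering behaviour above concrete, here is a minimal sketch of size-based bucketing. It is a toy model, not Cassandra's actual implementation: the function names, the 0.5x / 1.5x "similar size" bounds and the flush sizes are assumptions chosen for illustration. Only the threshold of 4 comes from the text above.

```python
MIN_THRESHOLD = 4                    # "by default 4" files of similar size trigger a compaction
BUCKET_LOW, BUCKET_HIGH = 0.5, 1.5   # assumed bounds for "similar size" (illustrative)


def bucket_sstables(sizes):
    """Group SSTable sizes (in GB) into buckets of roughly similar size."""
    buckets = []
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if BUCKET_LOW * avg <= size <= BUCKET_HIGH * avg:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough in size, so start a new one.
            buckets.append([size])
    return buckets


def compact_once(sizes):
    """Merge MIN_THRESHOLD files from the first bucket that is full enough.

    Returns the new list of sizes and a flag saying whether anything was merged.
    """
    for bucket in bucket_sstables(sizes):
        if len(bucket) >= MIN_THRESHOLD:
            merged = bucket[:MIN_THRESHOLD]
            remaining = list(sizes)
            for s in merged:
                remaining.remove(s)
            remaining.append(sum(merged))
            return remaining, True
    return list(sizes), False


def flush_and_compact(sizes, flushed_size):
    """Add one newly flushed SSTable, then compact until no bucket is full."""
    sizes = list(sizes) + [flushed_size]
    compacted = True
    while compacted:
        sizes, compacted = compact_once(sizes)
    return sorted(sizes)


if __name__ == "__main__":
    sizes = [100, 100, 100]          # three big files holding older data
    for _ in range(4):               # four small flushes of new data
        sizes = flush_and_compact(sizes, 1)
    print(sizes)                     # the small files merge; the 100GB files are untouched
    print(flush_and_compact([100, 100, 100], 100))  # a fourth 100GB file triggers the big merge
```

In this model the small, new files only ever merge with each other, and the large files are rewritten only once a fourth file of similar size appears.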

It feels a bit like a premature optimisation. 
 
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 23/05/2012, at 1:52 PM, Franc Carter wrote:

> On Wed, May 23, 2012 at 7:42 AM, aaron morton <aaron@thelastpickle.com> wrote:
> 1 KS with 24 CF's will use roughly the same resources as 24 KS's with 1 CF. Each CF:
> 
> * loads the bloom filter for each SSTable
> * samples the index for each sstable
> * uses row and key cache
> * has a current memtable and potentially memtables waiting to flush.
> * has secondary index CF's
> 
> I would generally avoid a data model that calls for CF's to be added in response to new entities or new data. Older data will be moved to larger files, and not included in compaction for newer data.
> 
> We were thinking of doing a major compaction after each year is 'closed off'. This would mean that compactions for the current year were dealing with a smaller amount of data and hence be faster and have less impact on a day-to-day basis. Our query patterns will only infrequently cross year boundaries.
> 
> Are we being naive ?
> 
> cheers
>  
> 
> Hope that helps. 
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 23/05/2012, at 3:31 AM, Luís Ferreira wrote:
> 
>> I have 24 keyspaces, each with a column family, and am considering changing it to 1 keyspace with 24 CFs. Would this be beneficial?
>> On May 22, 2012, at 12:56 PM, samal wrote:
>> 
>>> Not ideal. Cassandra now has global memtable tuning, but each CF still corresponds to memory in RAM. A year-wise CF means it will be in a read-only state the following year, yet its memtable will still consume RAM.
>>> 
>>> On 22-May-2012 5:01 PM, "Franc Carter" <franc.carter@sirca.org.au> wrote:
>>> On Tue, May 22, 2012 at 9:19 PM, aaron morton <aaron@thelastpickle.com> wrote:
>>> It's more the number of CF's than keyspaces.
>>> 
>>> Oh - does increasing the number of Column Families affect performance ?
>>> 
>>> The design we are working on at the moment is considering using a Column Family per year. We were thinking this would isolate compactions to a more manageable size as we don't update previous years.
>>> 
>>> cheers
>>>  
>>> 
>>> Cheers
>>> 
>>> -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>> 
>>> On 22/05/2012, at 6:58 PM, R. Verlangen wrote:
>>> 
>>>> Yes, it does. However, there's no real answer as to what the limit is: it depends on your hardware and cluster configuration.
>>>> 
>>>> You might even want to search the archives of this mailing list; I remember this has been asked before.
>>>> 
>>>> Cheers!
>>>> 
>>>> 2012/5/21 Luís Ferreira <zamith.28@gmail.com>
>>>> Hi,
>>>> 
>>>> Does the number of keyspaces affect the overall cassandra performance?
>>>> 
>>>> 
>>>> Regards,
>>>> Luís Ferreira
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> With kind regards,
>>>> 
>>>> Robin Verlangen
>>>> www.robinverlangen.nl
>>>> 
>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> Franc Carter | Systems architect | Sirca Ltd
>>> franc.carter@sirca.org.au | www.sirca.org.au
>>> Tel: +61 2 9236 9118 
>>> Level 9, 80 Clarence St, Sydney NSW 2000
>>> PO Box H58, Australia Square, Sydney NSW 1215
>>> 
>> 
>> Regards,
>> Luís Ferreira
>> 
>> 
>> 
> 
> 
> 
> 
> -- 
> Franc Carter | Systems architect | Sirca Ltd
> franc.carter@sirca.org.au | www.sirca.org.au
> Tel: +61 2 9236 9118 
> Level 9, 80 Clarence St, Sydney NSW 2000
> PO Box H58, Australia Square, Sydney NSW 1215
> 

