incubator-cassandra-user mailing list archives

From Mimi Aluminium <mimi.alumin...@gmail.com>
Subject Re: memory size and disk size prediction tool
Date Tue, 01 Feb 2011 08:01:59 GMT
Aaron,
Thanks a lot for your answer,
I had in mind something more generic that I am currently working on.
The idea is to have a tool with GUI screens where you can enter the
various column families you are using, along with column name and value sizes.
Then it will have another screen with application-aware field names
associated with their values, all defined by the user. Using these
parameters, this (modeling) tool should be able to calculate disk usage and
hopefully RAM usage...
Anyway, I am trying to do that for our own case using a simple Excel
spreadsheet. I will let you know when it is ready.
Thanks,
Miriam

On Thu, Jan 20, 2011 at 11:49 PM, Aaron Morton <aaron@thelastpickle.com> wrote:

> Not that I know of. Do you have an existing test system you can use as a
> baseline?
>
> For memory have a read of the JVM Heap Size section here
> http://wiki.apache.org/cassandra/MemtableThresholds
> You will also want to have some memory for disk caching and the OS. 8 or
> 12 GB feels like a good start.
>
> For disk capacity I just did some regular old guesswork, and multiplied my
> number by 1.25 to
> cover the on-disk overhead. You also want to avoid using more than 50% of
> the local disk space, due to
> compaction and the way the disk performance falls away. There is more info
> available here
> http://wiki.apache.org/cassandra/CassandraHardware
>
> How much throughput do you need? How much redundancy do you need? How much
> data do you
> plan to store?
>
> Hope that helps
> Aaron
>
> On 21 Jan, 2011, at 05:04 AM, Mimi Aluminium <mimi.aluminium@gmail.com>
> wrote:
>
>     Hi,
>
> We are implementing a 'middleware' layer on top of an underlying storage
> system and need to estimate costs for various system configurations.
> Specifically, I want to estimate the resources (memory, disk) for our
> data model.
>
> Is there a tool that, given certain storage configuration parameters
> (the number and sizes of column family fields and other details) and
> workload-dependent parameters such as average read/write rates, can
> predict the
> resource consumption (i.e., memory, disk) in an offline mode?
>
> Thanks,
> Miriam
>
>
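
The disk rule of thumb in Aaron's reply (multiply the raw data size by 1.25
for on-disk overhead, and keep each node's disk under 50% full) can be turned
into a few lines of arithmetic. The following is only a rough sketch of that
guesswork, not a tool from the thread: the function name, the
replication_factor and node_count parameters, and the assumption that data
spreads evenly across nodes are all illustrative additions.

```python
def estimate_disk_per_node(raw_data_gb, replication_factor, node_count,
                           overhead=1.25, max_fill=0.5):
    """Rough per-node disk estimate based on the rule of thumb in the thread.

    overhead: multiplier for on-disk overhead (1.25 in Aaron's reply).
    max_fill: largest fraction of local disk to use (50% in Aaron's reply).
    replication_factor, node_count: assumed inputs, not given in the thread.
    Returns (stored_gb_per_node, min_disk_gb_per_node).
    """
    if node_count <= 0 or not 0 < max_fill <= 1:
        raise ValueError("node_count must be positive and max_fill in (0, 1]")
    # Total bytes on disk across the cluster, including every replica.
    total_on_disk_gb = raw_data_gb * overhead * replication_factor
    # Assumes data is spread evenly over the nodes.
    stored_per_node_gb = total_on_disk_gb / node_count
    # Leave headroom for compaction by capping how full each disk may get.
    min_disk_per_node_gb = stored_per_node_gb / max_fill
    return stored_per_node_gb, min_disk_per_node_gb


if __name__ == "__main__":
    stored, disk = estimate_disk_per_node(400, replication_factor=3, node_count=6)
    print(f"stored per node: {stored} GB, minimum disk per node: {disk} GB")
```

With the made-up inputs above (400 GB of raw data, three replicas, six nodes),
each node would hold about 250 GB and would need at least a 500 GB disk. A
baseline measurement from a real test system, as Aaron suggests, would be a
better source for the overhead factor than the default used here.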
