I'd also consider what happens during maintenance and failure scenarios. Moving tens of TB
around takes a lot longer than moving hundreds of GB.
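
As a rough back-of-envelope sketch of that difference (the 100 MB/s sustained streaming rate and the two node sizes below are assumptions picked for illustration, not measurements from any cluster):

```python
def transfer_hours(size_gb: float, throughput_mb_s: float) -> float:
    """Hours needed to stream size_gb at a sustained throughput_mb_s (decimal units)."""
    seconds = (size_gb * 1000) / throughput_mb_s
    return seconds / 3600


# Hypothetical sustained rate while rebuilding or moving a node.
ASSUMED_MB_S = 100.0

small_node = transfer_hours(300, ASSUMED_MB_S)      # a node holding 300 GB
large_node = transfer_hours(30_000, ASSUMED_MB_S)   # a node holding 30 TB

print(f"300 GB: {small_node:.1f} h, 30 TB: {large_node:.1f} h")
```

At a fixed rate the time scales linearly with the data per node, so 100x the data means 100x the time spent rebuilding or rebalancing.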
Cheers
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
On 8 Jun 2011, at 06:40, AJ wrote:
> Thanks to everyone who responded thus far.
>
>
> On 6/7/2011 10:16 AM, Benjamin Coverston wrote:
> <snip>
>> Not to say that there aren't workloads where having many TB/node works, but
>> if you're planning to read from the data you're writing you do want to ensure that your working
>> set is stored in memory.
>>
>
> Thank you Ben. Can you elaborate some more on the above point? Are you referring to
> the OS's working set or the Cassandra caches? Why exactly do I need to ensure this?
>
> I am also wondering if there is any reason I should segregate my frequently written/read
> smallish data set (such as usage statistics) from my bulk, mostly read-only data set (static
> content) into separate CFs if the schema allows it. Would this be of any benefit?