cassandra-user mailing list archives

From Karl Hiramoto <k...@hiramoto.org>
Subject Re: 99.999% uptime - Operations Best Practices?
Date Thu, 23 Jun 2011 08:24:56 GMT
On 06/23/11 09:43, David Boxenhorn wrote:
> I think very high uptime, and very low data loss is achievable in
> Cassandra, but, for new users there are TONS of gotchas. You really
> have to know what you're doing, and I doubt that many people acquire
> that knowledge without making a lot of mistakes.
>
> I see above that most people are talking about configuration issues.
> But, the first thing that you will probably do, before you have any
> experience with Cassandra(!), is architect your system. Architecture
> is not easily changed when you bump into a gotcha, and for some reason
> you really have to search the literature well to find out about them.
> So, my contributions:
>
> The too many CFs problem. Cassandra doesn't do well with many column
> families. If you come from a relational world, a real application can
> easily have hundreds of tables. Even if you combine them into entities
> (which is the Cassandra way), you can easily end up with dozens of
> entities. The most natural thing for someone with a relational
> background is to have one CF per entity, plus indexes according to
> your needs. Don't do it. You need to store multiple entities in the
> same CF. Group them together according to access patterns (i.e. when
> you use X, you probably also need Y), and distinguish them by adding
> a prefix to their keys (e.g. entityName@key).
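
For anyone unsure what the entityName@key prefix looks like in practice,
here is a minimal sketch in Python. The helper names (make_row_key,
split_row_key) and the sample entity names are made up for illustration
and are not part of any Cassandra client; the sketch only builds and
parses the key strings, and assumes entity names never contain "@".

```python
from typing import Tuple

SEPARATOR = "@"  # separator taken from the entityName@key example above


def make_row_key(entity_name: str, key: str) -> str:
    """Build a row key that lets several entity types share one CF."""
    if SEPARATOR in entity_name:
        # Assumption: entity names are fixed identifiers without the separator,
        # so the first "@" in a row key always ends the entity name.
        raise ValueError("entity name must not contain %r" % SEPARATOR)
    return entity_name + SEPARATOR + key


def split_row_key(row_key: str) -> Tuple[str, str]:
    """Recover (entity_name, key) from a prefixed row key."""
    entity_name, sep, key = row_key.partition(SEPARATOR)
    if not sep:
        raise ValueError("row key has no entity prefix: %r" % row_key)
    return entity_name, key


# Hypothetical entities that are read together, so they share one CF.
user_key = make_row_key("User", "42")
profile_key = make_row_key("Profile", "42")
```

Splitting on the first separator only means the original key may itself
contain "@" without breaking the round trip.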

While avoiding too many CFs is a good idea, I would also advise
against a very large CF. Keeping the size of each CF down helps speed
up repair and compaction.


--
Karl
