cassandra-user mailing list archives

From David Boxenhorn <>
Subject Re: Do supercolumns have a purpose?
Date Thu, 03 Feb 2011 12:33:18 GMT

Thanks Sylvain!

Can I vote for internally implementing supercolumn families as regular
column families? (With a smooth upgrade process that doesn't require
shutting down a live cluster.)

What if supercolumn families were supported as regular column families + an
index (on what used to be supercolumn keys)? Would that solve some problems?

On Thu, Feb 3, 2011 at 2:00 PM, Sylvain Lebresne <> wrote:

> > Is there any advantage to using supercolumns
> > (columnFamilyName[superColumnName[columnName[val]]]) instead of regular
> > columns with concatenated keys
> > (columnFamilyName[superColumnName@columnName[val]])?
> >
> > When I designed my data model, I used supercolumns wherever I needed two
> > levels of key depth - just because they were there, and I figured that
> > they must be there for a reason.
> >
> > Now I see that in 0.7 secondary indexes don't work on supercolumns or
> > subcolumns (is that right?), which seems to me like a very serious
> > limitation of supercolumn families.
> >
> > It raises the question: Is there anything that supercolumn families are
> > good for?
> There are a number of queries that you cannot do (or can do only less
> conveniently) if you encode super columns using regular columns with
> concatenated keys:
> 1) If you use regular columns with concatenated keys, the count argument
> counts simple columns. With super columns it counts super columns. It means
> that you can't do "give me the first 10 super columns of this row".
> 2) If you need to get x super columns by name, you'll have to issue x
> get_slice queries (one for each super column). On the client side it sucks.
> Internally in Cassandra we could do it reasonably well though.
> 3) You cannot remove entire super columns since there is no support for
> range deletions.
> Moreover, the encoding with concatenated keys uses more disk space (and
> less disk used for the same information means fewer things to read, so it
> may have a slight impact on read performance too -- it's probably really
> slight for most usage, but nevertheless).
> > And here's a related question: Why can't Cassandra implement supercolumn
> > families as regular column families, internally, and give you that
> > functionality?
> For 1) and 2) above, I think we could deal with those internally fairly
> easily and rather well (which means it wouldn't be much worse
> performance-wise than with the current implementation of super columns,
> not that it would be better). For 3), range deletes are harder and would
> require more significant changes (that doesn't mean that Cassandra will
> never have it). Even without that, there would still be the lost disk
> space.
> --
> Sylvain
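
Points 1) and 3) in the quoted reply can be sketched with a toy in-memory
model. This is plain Python over a dict, not Cassandra client code. The "@"
separator comes from the original question; the function names and the sample
row are made up for illustration, and the sketch assumes "@" never appears in
a super column name.

```python
SEP = "@"  # separator from the question: superColumnName@columnName

# Hypothetical sample row, written the "super column" way: {super: {col: val}}
ROW = {
    "alice": {"name": "Alice", "email": "a.example", "age": "30"},
    "bob": {"name": "Bob", "email": "b.example"},
}


def flatten(super_row):
    """Encode {super: {col: val}} as a flat row sorted by concatenated key."""
    flat = {}
    for sc_name, cols in super_row.items():
        for col_name, val in cols.items():
            flat[f"{sc_name}{SEP}{col_name}"] = val
    return dict(sorted(flat.items()))


def first_n_columns(flat_row, n):
    """What a count-limited slice sees on a flat row: n simple columns."""
    return list(flat_row.items())[:n]


def first_n_super_columns(flat_row, n):
    """Client-side workaround for point 1): regroup by prefix and stop once
    a column belonging to super column n+1 shows up."""
    grouped = {}
    for key, val in flat_row.items():
        sc_name, col_name = key.split(SEP, 1)
        if sc_name not in grouped and len(grouped) == n:
            break
        grouped.setdefault(sc_name, {})[col_name] = val
    return grouped


def remove_super_column(flat_row, sc_name):
    """Point 3): without a range delete, each column under the prefix has to
    be found and deleted one by one. Returns the number of deletions."""
    doomed = [k for k in flat_row if k.startswith(sc_name + SEP)]
    for k in doomed:
        del flat_row[k]
    return len(doomed)
```

With the sample row, first_n_columns(flatten(ROW), 2) returns two of alice's
three columns rather than two super columns, which is the count mismatch from
point 1). The regrouping helper only works here because the whole row is
already in memory; a real client would not know in advance how many simple
columns to request.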
