cassandra-user mailing list archives

From Vanger <>
Subject Re: 1000's of CF's.
Date Mon, 08 Oct 2012 10:02:51 GMT
So what should the Cassandra architecture look like when we need to run Hadoop M/R jobs
without being restricted by the number of CFs?
What we have now is a fair number of CFs (> 2K), and this number is slowly growing, so we
are already planning to merge partitioned CFs. But our next goal is to run Hadoop tasks on those
CFs. All we have is plain Hector and a custom ORM on top of it. As far as I understand, VirtualKeyspace
doesn't help in our case.
Also, I don't understand why support for many CFs (or built-in partitioning) is not implemented
on the Cassandra side. Can anybody explain why this can or cannot be done in Cassandra?

Just in case:
We're using Cassandra 1.0.11 on 30 nodes (planning to upgrade to 1.1.* soon).

W/ best regards, 

On 04.10.2012 0:10, Hiller, Dean wrote:
> Okay, so it only took me two solid days not a week.  PlayOrm in master branch now supports
> virtual CF's or virtual tables in ONE CF, so you can have 1000's or millions of virtual CF's
> in one CF now.  It works with all the Scalable-SQL, works with the joins, and works with the
> PlayOrm command line tool.
> Two ways to do it, if you are using the ORM half, you just annotate
> @NoSqlEntity("MyVirtualCfName")
> @NoSqlVirtualCf(storedInCf="sharedCf")
> So it's stored in sharedCf with the table name of MyVirtualCfName (in command line tool,
> use MyVirtualCfName to query the table).
> Then if you don't know your meta data ahead of time, you need to create DboTableMeta
> and DboColumnMeta objects and save them for every table you create and can use TypedRow to
> read and persist (which is what we have a project doing).
> If you try it out let me know.  We usually get bug fixes in pretty fast if you run into
> anything.  (more and more questions are forming on stack overflow as well ;) ).
> Later,
> Dean
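
For readers new to the idea of "virtual CFs in one CF" described in the quoted message, here is a minimal sketch of one common way such a scheme can work: prefixing each row key with the virtual CF name. It does not use PlayOrm, Hector, or Cassandra, and it is not a description of how PlayOrm implements the feature. The class VirtualCfSketch, its methods, and the ':' separator are all hypothetical, and an in-memory sorted map stands in for the single shared physical CF.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: not PlayOrm's, Hector's, or Cassandra's API.
class VirtualCfSketch {
    // Assumed separator between virtual CF name and row key.
    private static final char SEP = ':';

    // Stand-in for ONE physical CF: physical row key -> (column name -> value).
    private final TreeMap<String, Map<String, String>> sharedCf = new TreeMap<>();

    // Build the physical row key by prefixing the virtual CF name.
    public static String physicalKey(String virtualCf, String rowKey) {
        if (virtualCf.indexOf(SEP) >= 0) {
            throw new IllegalArgumentException("virtual CF name must not contain '" + SEP + "'");
        }
        return virtualCf + SEP + rowKey;
    }

    public void put(String virtualCf, String rowKey, String column, String value) {
        sharedCf.computeIfAbsent(physicalKey(virtualCf, rowKey), k -> new LinkedHashMap<>())
                .put(column, value);
    }

    public String get(String virtualCf, String rowKey, String column) {
        Map<String, String> row = sharedCf.get(physicalKey(virtualCf, rowKey));
        return row == null ? null : row.get(column);
    }

    // Return all rows of one virtual CF, keyed by their logical (unprefixed) row key.
    // This prefix scan relies on the sorted in-memory map used here.
    public Map<String, Map<String, String>> scan(String virtualCf) {
        String start = virtualCf + SEP;
        String end = virtualCf + (char) (SEP + 1); // first string past the prefix range
        Map<String, Map<String, String>> rows = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, String>> e
                : sharedCf.subMap(start, true, end, false).entrySet()) {
            rows.put(e.getKey().substring(start.length()), e.getValue());
        }
        return rows;
    }

    public static void main(String[] args) {
        VirtualCfSketch store = new VirtualCfSketch();
        store.put("users", "42", "name", "alice");
        store.put("orders", "42", "total", "9.99");
        System.out.println(store.get("users", "42", "name"));
        System.out.println(store.scan("orders"));
    }
}
```

Two rows with the same logical key ("42") stay separate because their physical keys differ. Note that the prefix scan works here only because the map is sorted by key; on a cluster using a hash-based partitioner, rows are not stored in key order, so listing the rows of one virtual CF (for example, as input to a Hadoop job) would need a separate index or a filter over the whole shared CF.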
