cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hiller, Dean" <Dean.Hil...@nrel.gov>
Subject Re: 1000's of column families
Date Tue, 02 Oct 2012 15:55:24 GMT
Ben, Brian, 
   By the way, PlayOrm offers a NoSqlTypedSession that is different than
the ORM half of PlayOrm dealing in raw stuff that does indexing(so you can
do Scalable SQL on data that has no ORM on top of it).  That is what we
use for our 1000's of CF's as we don't know the format of any of those
tables ahead of time(in our world, users tell us the format and wire in
streams through an api we expose AND they tell PlayOrm which columns to
index).  That layer deals with BigInteger, BigDecimal, String and I think
byte[].

So, I am going to add virtual CF's to PlayOrm in the coming week and we
are going to feed in streams and partition the virtual CF's which sit in a
single real CF using PlayOrm partitioning and then we can then query into
each partition.  

The only issue is really what partitions exist and that is left to the
client to keep track of, but if your app knows all the partitions(and that
could be saved to some rows in the nosql store), then I will probably try
out storm after that.

Later,
Dean

On 10/2/12 9:09 AM, "Ben Hood" <0x6e6562@gmail.com> wrote:

>On Tue, Oct 2, 2012 at 3:37 PM, Brian O'Neill <boneill42@gmail.com> wrote:
>> Exactly.
>
>So you're back to the deliberation between using multiple CFs
>(potentially with some known working upper bound*) or feeding your map
>reduce in some other way (as you decided to do with Storm). In my
>particular scenario I'd like to be able to do a combination of some
>batch processing on top of less frequently changing data (hence why I
>was looking at Hadoop) and some real time analytics.
>
>Cheers,
>
>Ben
>
>(*) Not sure whether this applies to an individual keyspace or an
>entire cluster.


Mime
View raw message