cassandra-user mailing list archives

From Paul Brown <paulrbr...@gmail.com>
Subject Re: R on Cassandra
Date Wed, 09 Nov 2011 19:57:04 GMT

Hi, Brian --

A little late to reply, but I'm slowly catching up.

IMHO, you'll be better off pulling the data out of Cassandra with a tool like Pig
(probably with a bit of aggregation and filtering) and then operating on it in R as a static
delimited file.  If you need additional batching (as well as cleaning and aggregation),
you can automate that with various tools.  Some of this depends on your modeling workflow,
but it's reasonable to expect that you'll want to return to exactly the same dataset
and repeat some processes as you refine your approach.  That's difficult, if not impossible,
to do against live data.
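
A minimal sketch of the "static file" idea, in Python rather than Pig or R and purely illustrative: the sample export, its column names, and both helper functions are hypothetical. It records a checksum of the exported delimited data so a later run can confirm it is working on exactly the same dataset, then computes a simple per-key aggregate from it.

```python
import csv
import hashlib
import io


def checksum(text):
    """Return a SHA-256 hex digest identifying one exact export."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def mean_by_key(text, key_col, value_col, delimiter="\t"):
    """Average value_col per distinct key_col in delimited text."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    totals = {}
    for row in reader:
        total, count = totals.get(row[key_col], (0.0, 0))
        totals[row[key_col]] = (total + float(row[value_col]), count + 1)
    return {key: total / count for key, (total, count) in totals.items()}


# Hypothetical tab-delimited export, standing in for a file written by a batch job.
EXPORT = "region\tvalue\neast\t1.0\neast\t3.0\nwest\t10.0\n"

if __name__ == "__main__":
    print("dataset id:", checksum(EXPORT))
    print("means:", mean_by_key(EXPORT, "region", "value"))
```

Storing the digest alongside the results makes it easy to tell whether two modeling runs used the same snapshot; the same check is not possible against a live column family that keeps changing.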

-- Paul

On Nov 1, 2011, at 2:02 PM, Brian O'Neill wrote:

> I saw a mention of R on Cassandra:
> http://comments.gmane.org/gmane.comp.db.cassandra.user/5681
> 
> Does anyone know if this has traction somewhere?
> 
> -brian
> 
> -- 
> Brian ONeill
> Lead Architect, Health Market Science (http://healthmarketscience.com)
> mobile:215.588.6024
> blog: http://weblogs.java.net/blog/boneill42/
> blog: http://brianoneill.blogspot.com/
> 

