drill-user mailing list archives

From: "Brian O'Neill" <b...@alumni.brown.edu>
Subject: Getting plugged in... (Cassandra and Drill?)
Date: Mon, 21 Jan 2013 04:37:47 GMT
Last week, Brad Anderson came up and presented at the PhillyDB meetup.
http://www.slideshare.net/boorad/phillydb-talk-beyond-batch

He gave us an overview of Drill, and I'm curious...

At present, we rely heavily on Storm + Cassandra.
http://brianoneill.blogspot.com/2012/08/a-big-data-trifecta-storm-kafka-and.html

We treat CRUD operations as events. Within Storm, we then calculate
aggregate counts of the entities flowing through the system, broken
down by various dimensions. That works well, but we still need an
ad hoc reporting capability, and a way to report on historical data
that is no longer active in the system.
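
As a rough illustration of the aggregation described above (plain Python
rather than actual Storm code; the event fields and values are made up
for the example), counting by dimension amounts to something like:

```python
from collections import Counter

# Hypothetical CRUD events; field names and values are illustrative only.
events = [
    {"op": "create", "entity": "provider", "state": "PA"},
    {"op": "update", "entity": "provider", "state": "PA"},
    {"op": "create", "entity": "practice", "state": "NJ"},
    {"op": "delete", "entity": "provider", "state": "NJ"},
]


def count_by(events, dimensions):
    """Count events grouped by the given tuple of dimension names."""
    counts = Counter()
    for event in events:
        key = tuple(event[d] for d in dimensions)
        counts[key] += 1
    return counts


by_entity = count_by(events, ("entity",))
by_entity_and_op = count_by(events, ("entity", "op"))
```

Counts like these are fixed in advance by the chosen dimensions, which
is why ad hoc queries over the stored data are a separate problem.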

Would it be possible to use the Drill engine against a Cassandra backend?
If so, what would that involve? Implementing some API?

I assume that performance would be terrible unless the data is somehow
stored using the columnar data format from the Dremel paper. Is that
accurate? Does anyone know whether someone has attempted to translate
that format to Cassandra?
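
To make "columnar" concrete, here is a minimal sketch, assuming flat
JSON records with hypothetical fields. It is only a simple row-to-column
pivot, not the Dremel format itself: the format in the paper also
encodes nested and repeated fields using repetition and definition
levels, which this sketch ignores.

```python
import json

# Hypothetical entities stored as JSON strings, one per row.
raw_rows = [
    '{"id": 1, "entity": "provider", "state": "PA"}',
    '{"id": 2, "entity": "practice", "state": "NJ"}',
    '{"id": 3, "entity": "provider"}',
]

rows = [json.loads(s) for s in raw_rows]


def to_columns(rows):
    """Pivot a list of flat records into one list of values per field.

    Missing fields are stored as None so every column has the same length.
    """
    names = sorted({name for row in rows for name in row})
    return {name: [row.get(name) for row in rows] for name in names}


columns = to_columns(rows)

# A query that touches one field only has to read that one column.
states = columns["state"]
```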

Regardless, I'm very interested in getting involved and am no stranger
to getting my hands dirty.
Let me know if you can provide any direction. (Our entities are
currently stored as JSON in Cassandra.)

-brian


-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://brianoneill.blogspot.com/
twitter: @boneill42
