Thanks for the information, Drew and Jonathan.
Is there any difference in performance between using Pig and running MapReduce directly against the data store?
I will run experiments with both of them in a little while, though.
The Cassandra column family input format will go over an entire
column family, sending a slice of a row into a mapper at a time. From
there, there's a lot you can do. As far as how you aggregate data
together, I'd suggest experimenting with the latest version of Pig,
which thankfully supports the new input format. It gives you a
SQL-esque syntax for manipulating the data and is probably the easiest
way to experiment.
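
To make the row-slice idea concrete, here is a minimal sketch in plain Python of the map and reduce steps for the device-statistics case. It does not touch Cassandra, Hadoop, or Pig: the row keys, column names, and the map_row / reduce_values / run_job helpers are all made up for illustration, and the "device:date" key layout is only an assumption about how the data might be stored.

```python
from collections import defaultdict

# Hypothetical column family contents: row key -> {column name: value}.
# The "device:date" row key layout is an assumption for this sketch.
ROWS = {
    "device-1:2010-06-24": {"bytes_in": 100, "bytes_out": 40},
    "device-1:2010-06-25": {"bytes_in": 50, "bytes_out": 10},
    "device-2:2010-06-24": {"bytes_in": 7, "bytes_out": 3},
}


def map_row(row_key, columns):
    """Mapper: receives one row slice and emits ((device, stat), value) pairs."""
    device = row_key.split(":", 1)[0]
    for name, value in columns.items():
        yield (device, name), value


def reduce_values(key, values):
    """Reducer: sums every value emitted for one (device, stat) key."""
    return key, sum(values)


def run_job(rows):
    """Stand-in for the framework: feed rows to the mapper, group, reduce."""
    grouped = defaultdict(list)
    for row_key, columns in rows.items():
        for key, value in map_row(row_key, columns):
            grouped[key].append(value)
    return dict(reduce_values(key, values) for key, values in grouped.items())


totals = run_job(ROWS)
```

In a real job the grouping step is done by the framework between map and reduce. In Pig, roughly the same aggregation would be expressed as a GROUP followed by a SUM.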
On Thu, Jun 24, 2010 at 11:01 AM, Atul Gosain <firstname.lastname@example.org> wrote:
> What kind of MapReduce support is provided for Cassandra?
> Can I get some columns from different rows and then aggregate them
> together? It's basically aggregation of statistics for various devices
> connected to a network manager. Is this the right kind of use case to be
> supported by MR?