cassandra-user mailing list archives

From Jeremy Hanna
Subject Pygmalion - a github project for pig + cassandra
Date Wed, 27 Apr 2011 18:57:10 GMT
Hi all,

A little while back, I started a project called Pygmalion, a collection of example scripts
and UDFs for people using Pig with Cassandra.  It currently includes a few handy UDFs:

FromCassandraBag: a way to convert from what Cassandra returns (key:chararray, columns:bag
{column:tuple (name, value)}) to something more tabular (key, value1, value2, value3).  You
specify the columns whose values you want to project, which makes it a good fit for tabular data.
ToCassandraBag: a way to convert from (key, value1, value2, value3) to what Cassandra expects
when writing - (key:chararray, columns:bag {column:tuple (name, value)}) - the column names
are extracted from the variable names in the Pig script.
Both contributed by Jacob Perkins with slight revisions by Jeremy Hanna
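
To make the two shapes concrete, here is a minimal sketch in plain Python rather than Pig.
It is not the Pygmalion code: the function names, the argument order, and the choice to
return None for a missing column are all assumptions made for illustration only.

```python
def from_cassandra_bag(row, wanted):
    """Flatten (key, [(name, value), ...]) into (key, value1, value2, ...)."""
    key, columns = row
    lookup = dict(columns)
    # Assumption for this sketch: a column that is absent becomes None.
    return (key,) + tuple(lookup.get(name) for name in wanted)


def to_cassandra_bag(names, row):
    """Rebuild (key, [(name, value), ...]) from a flat tuple and its field names."""
    key, values = row[0], row[1:]
    return (key, list(zip(names, values)))


# Hypothetical sample row, shaped like what Cassandra hands to Pig.
row = ("user1", [("name", "Ann"), ("city", "Austin")])
flat = from_cassandra_bag(row, ["name", "city"])
print(flat)
print(to_cassandra_bag(["name", "city"], flat))
```

In the real ToCassandraBag the names come from the Pig variable names, so the explicit
names argument above only stands in for that step.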

StringConcat: probably something everyone ends up implementing.  Unlike CONCAT, which only
takes two strings, it concatenates any number of strings.
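
The idea, sketched in Python rather than Pig (the function name and the null handling are
assumptions for this sketch, not necessarily what the UDF does):

```python
def string_concat(*parts):
    """Concatenate any number of values into one string."""
    # Assumption: a None anywhere makes the whole result None.
    if any(p is None for p in parts):
        return None
    return "".join(str(p) for p in parts)


print(string_concat("a", "b", "c", "d"))
```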

GenerateTimeUUID: a UDF that generates a time UUID, either from the current time or from a
time you supply.
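
For anyone unfamiliar with time UUIDs, here is a rough Python sketch of both modes.  It only
illustrates how a version 1 UUID encodes a timestamp; it is not the UDF's code, and the
function name and the random clock sequence are assumptions made for this example.

```python
import random
import uuid
from datetime import datetime, timezone

# 100 ns intervals between the UUID epoch (1582-10-15) and the Unix epoch.
UUID_EPOCH_OFFSET = 0x01B21DD213814000
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def time_uuid(when=None):
    """Return a version 1 UUID for now, or for a timezone-aware datetime."""
    if when is None:
        return uuid.uuid1()
    delta = when - UNIX_EPOCH
    # Integer arithmetic avoids float rounding at 100 ns resolution.
    ts = ((delta.days * 86400 + delta.seconds) * 10_000_000
          + delta.microseconds * 10 + UUID_EPOCH_OFFSET)
    time_low = ts & 0xFFFFFFFF
    time_mid = (ts >> 32) & 0xFFFF
    time_hi_version = ((ts >> 48) & 0x0FFF) | 0x1000  # version 1
    clock_seq = random.getrandbits(14)  # assumption: random clock sequence
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80  # RFC 4122 variant
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low,
                             uuid.getnode()))


print(time_uuid())
print(time_uuid(datetime(2011, 4, 27, 18, 57, 10, tzinfo=timezone.utc)))
```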

It definitely needs more work and examples, but I've been using the UDFs in there for a while
with Cassandra 0.7.5 (previously 0.7-branch).  Now that 0.7.5 is released, I'd just like to
let people know about it in case they would like to contribute or even just use it.