cassandra-user mailing list archives

From Jonathan Ellis
Subject Re: Pygmalion - a github project for pig + cassandra
Date Wed, 27 Apr 2011 19:31:30 GMT

On Wed, Apr 27, 2011 at 1:57 PM, Jeremy Hanna wrote:
> Hi all,
> A little while back, I started a project called pygmalion for example scripts and UDFs for people using Pig with Cassandra. Currently there are a few handy UDFs in there, like:
> FromCassandraBag: converts what Cassandra returns (key:chararray, columns:bag {column:tuple (name, value)}) into something more tabular (key, value1, value2, value3). You specify the values you want to project; it's good for tabular data.
> ToCassandraBag: converts (key, value1, value2, value3) into what Cassandra expects when writing, (key:chararray, columns:bag {column:tuple (name, value)}). The column names are extracted from the variable names in the Pig script.
> Both contributed by Jacob Perkins with slight revisions by Jeremy Hanna.
> StringConcat: probably something everyone implements, but unlike CONCAT, which only takes two strings, it concatenates any number of strings.
> GenerateTimeUUID: a UDF that generates a time UUID, with or without a time to base it on.
> It definitely needs more work and examples, but I've been using the UDFs in there for a while with Cassandra 0.7.5 (previously 0.7-branch). Now that 0.7.5 is released, I'd just like to let people know about it in case they would like to contribute or even just use it.
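
For readers unfamiliar with the two row shapes described in the quoted message, here is a rough Python sketch of the conversion in both directions. It is not the pygmalion code: the function names, the example row, and the choice to return None for a missing column are all assumptions made for illustration, and the column names are passed in explicitly rather than taken from Pig variable names.

```python
# Illustrative only: plain-Python stand-ins for the two row shapes described
# in the quoted message. These are not the pygmalion UDFs.

def from_cassandra_bag(row, wanted):
    """(key, [(name, value), ...]) -> (key, value1, value2, ...).

    `wanted` lists the column names to project, in output order.
    A column missing from the row comes back as None (a choice made
    for this sketch, not documented pygmalion behaviour).
    """
    key, columns = row
    by_name = dict(columns)
    return (key,) + tuple(by_name.get(name) for name in wanted)


def to_cassandra_bag(record, names):
    """(key, value1, value2, ...) -> (key, [(name, value), ...]).

    `names` supplies one column name per value. The Pig UDF is described
    as taking these from variable names; here they are passed in.
    """
    key, values = record[0], record[1:]
    if len(values) != len(names):
        raise ValueError("need exactly one column name per value")
    return (key, list(zip(names, values)))


# Hypothetical example row in the "Cassandra bag" shape.
row = ("user42", [("name", "Ada"), ("city", "London"), ("age", "36")])
flat = from_cassandra_bag(row, ["name", "age"])
print(flat)
print(to_cassandra_bag(flat, ["name", "age"]))
```

Projecting only "name" and "age" drops the "city" column, which is the sense in which the tabular form suits data with a known, fixed set of columns.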

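The "with or without a time to base it on" behaviour of a time UUID generator can be sketched in Python as well. Again, this is only an assumption about the general technique (a version 1 UUID whose timestamp field is set from a supplied time), not the GenerateTimeUUID implementation; `generate_time_uuid` and `UUID_EPOCH_OFFSET` are names invented for this example.

```python
import uuid
from datetime import datetime, timezone

# Number of 100-nanosecond intervals between the UUID epoch (1582-10-15)
# and the Unix epoch (1970-01-01).
UUID_EPOCH_OFFSET = 0x01B21DD213814000
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def generate_time_uuid(when=None):
    """Return a version 1 UUID for now, or for `when` if it is given.

    `when` must be a timezone-aware datetime. This name and signature
    are hypothetical, chosen only for the sketch.
    """
    if when is None:
        return uuid.uuid1()

    # Integer arithmetic avoids float rounding in the microseconds.
    delta = when - UNIX_EPOCH
    micros = (delta.days * 86400 + delta.seconds) * 1_000_000 + delta.microseconds
    ticks = micros * 10 + UUID_EPOCH_OFFSET

    time_low = ticks & 0xFFFFFFFF
    time_mid = (ticks >> 32) & 0xFFFF
    time_hi_version = ((ticks >> 48) & 0x0FFF) | 0x1000  # version nibble = 1

    # Borrow the clock sequence, variant bits and node from a fresh uuid1().
    template = uuid.uuid1()
    return uuid.UUID(fields=(
        time_low,
        time_mid,
        time_hi_version,
        template.clock_seq_hi_variant,
        template.clock_seq_low,
        template.node,
    ))


stamp = datetime(2011, 4, 27, 19, 31, 30, tzinfo=timezone.utc)
u = generate_time_uuid(stamp)
print(u, u.version)
```

Because the timestamp occupies the leading fields, UUIDs built this way for different times can be ordered by time, which is what makes them useful as time-ordered column names.
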
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
