hadoop-pig-dev mailing list archives

From Alan Gates <ga...@yahoo-inc.com>
Subject Re: Pig rework on the types branch
Date Fri, 10 Oct 2008 22:25:53 GMT

I wanted to talk with you about this.  We aren't requiring UDF  
writers to convert their contributions in piggybank.  However, we're  
not promising when we'll get to it either.  It will be a best effort  
kind of thing.  So, if you'd be willing to update yours, that would  
be great.

For your UDFs the conversion shouldn't be too bad.  The way tuples  
are created has changed.  And you'll want to change from DataAtom to  
String.  Other than that, I think you'll be good to go.

As to whether you want to write your new classes on types or trunk,  
it depends on the version of pig you want to use.  If you're  
comfortable with using the types stuff, I would definitely encourage  
you to work there, since that's the future and it avoids the need for  
future translation.

Thanks for the heads up on the doc, I've fixed it.


On Oct 10, 2008, at 3:12 PM, Earl Cahill wrote:

> Just read through a bit of the doc, and it sounds to me like if I  
> am going to be writing new classes, I should start them on the  
> types branch?  Does that sound right?  It looks at least a bit  
> nontrivial to convert from trunk to the types branch.
> By the way, looks like || isn't being represented properly, like in
> if (input == null |||| input.size() == 0)
> Thanks,
> Earl
> ----- Original Message ----
> From: Alan Gates <gates@yahoo-inc.com>
> To: pig-dev@incubator.apache.org; pig-user@incubator.apache.org
> Sent: Friday, October 10, 2008 10:24:06 AM
> Subject: Pig rework on the types branch
> All,
> As you have probably noticed if you've been watching the mailing
> list, much work has gone into an almost complete rework of pig over
> the last six months.  This work has been done on the types branch in
> order to avoid destabilizing the trunk.  This work includes a
> complete rewrite of the backend of pig, including the interface to
> map reduce and the operators that execute a pig script on hadoop.  It
> also introduces a type system to pig.  A number of new features have
> been added and performance has been significantly improved (averaging
> around 2x though varying greatly by script).  And, while we strove to
> be backward compatible whenever possible, there are places where
> changes are required in user scripts or UDFs.  Full details of the
> changes are available at http://wiki.apache.org/pig/TrunkToTypesChanges
> After much testing by the developers and a number of brave users, we
> feel the code on the types branch is now approaching stability.  We
> would like to suggest that users begin using the code on the types
> branch.  At some point in the near future, we would like to merge the
> types branch into trunk and then do a 0.2.0 release.
> Alan.