hadoop-pig-dev mailing list archives

From Alan Gates <ga...@yahoo-inc.com>
Subject Re: requirements for Pig 1.0?
Date Wed, 24 Jun 2009 18:42:21 GMT
To be clear, going to 1.0 is not about having a certain set of  
features.  It is about stability and usability.  When a project  
declares itself 1.0 it is making some guarantees regarding the  
stability of its interfaces (in Pig's case this is Pig Latin, UDFs,  
and command line usage).  It is also declaring itself ready for the  
world at large, not just the brave and the free.  New features can
come in as experimental once we're 1.0, but the semantics of the
language and UDFs can't keep shifting (as they have over the last
several releases and will continue to for a bit, I think).

With that in mind, further comments inlined.

On Jun 24, 2009, at 10:18 AM, Dmitriy Ryaboy wrote:

> Alan, any thoughts on performance baselines and benchmarks?
Meaning, do we need to reach a certain speed before 1.0?  I don't think  
so.  Pig is fast enough now that many people find it useful.  We want  
to continue working to shrink the gap between Pig and MR, but I don't  
see this as a blocker for 1.0.

> I am a little surprised that you think SQL is a requirement for 1.0,  
> since
> it's essentially an overlay, not core functionality.
If we were debating today whether to go 1.0, I agree that we would not
wait for SQL.  But we aren't (at least I wouldn't vote for it now), and
since SQL will be in soon, it will need to stabilize before 1.0.

> What about the storage layer rewrite (or is that what you referred  
> to with
> your first bullet-point)?
To be clear, Zebra (the columnar store work) is not a rewrite of the
storage layer.  It is an additional storage option we want to
support.  We aren't changing current support for load and store.

> Also, the subject of making more (or all) operators nestable within a
> foreach comes up now and then.. would you consider this important  
> for 1.0,
> or something that can wait?
This would be an added feature, not a semantic change in Pig Latin.

> Integration with other languages (a-la PyPig)?
Again, this is a new feature, not a stability issue.

> The Roadmap on the Wiki is still "as of Q3 2007".... makes it hard  
> for an
> outside contributor to know where to jump :-).
Agreed.  Olga has given me the task of updating this soon.  I'm going  
to try to get to that over the next couple of weeks.  This discussion  
will certainly provide input to that update.

