pig-dev mailing list archives

From Russell Jurney <russell.jur...@gmail.com>
Subject Re: pig 0.11 candidate 2 feedback: Several problems
Date Wed, 20 Feb 2013 19:33:20 GMT
Isn't the point of an RC to find and fix bugs like these?


On Wed, Feb 20, 2013 at 11:31 AM, Bill Graham <billgraham@gmail.com> wrote:

> Regarding Pig 11 rc2, I propose we continue with the current vote as is
> (which closes today EOD). Patches for 0.20.2 issues can be rolled into a
> Pig 0.11.1 release whenever they're available and tested.
>
>
>
> On Wed, Feb 20, 2013 at 9:24 AM, Olga Natkovich <onatkovich@yahoo.com> wrote:
>
> > I agree that supporting as much as we can is a good goal. The issue is
> > who is going to be testing against all these versions? We found the issues
> > under discussion because of a customer report, not because we consistently
> > test against all versions. Perhaps when we decide which versions to support
> > for the next release we also need to agree on who is going to be testing and
> > maintaining compatibility with a particular version.
> >
> > For instance, since Hadoop 23 compatibility is important for us at Yahoo, we
> > have been maintaining compatibility with this version for 0.9, 0.10 and
> > will do the same for 0.11 and going forward. I think we would need others
> > to step in and claim the versions of their interest.
> >
> > Olga
> >
> >
> > ________________________________
> >  From: Kai Londenberg <kai.londenberg@googlemail.com>
> > To: dev@pig.apache.org
> > Sent: Wednesday, February 20, 2013 1:51 AM
> > Subject: Re: pig 0.11 candidate 2 feedback: Several problems
> >
> > Hi,
> >
> > I strongly agree with Jonathan here. If there are good reasons why you
> > can't support an older version of Hadoop any more, that's one thing.
> > But having to change 2 lines of code doesn't really qualify as such in
> > my point of view ;)
> >
> > At least for me, pig support for 0.20.2 is essential - without it, I
> > can't use it. If it doesn't support it, I'll have to branch pig and
> > hack it myself, or stop using it.
> >
> > I guess, there are a lot of people still running 0.20.2 Clusters. If
> > you really have lots of data stored on HDFS and a continuously busy
> > cluster, an upgrade is nothing you do "just because".
> >
> >
> > 2013/2/20 Jonathan Coveney <jcoveney@gmail.com>:
> > > I agree that we shouldn't have to support old versions forever. That said,
> > > I also don't think we should be too blase about supporting older versions
> > > where it is not odious to do so. We have a lot of competition in the
> > > language space and the broader the versions we can support, the better
> > > (assuming it isn't too odious to do so). In this case, I don't think it
> > > should be too hard to change ObjectSerializer so that the commons-codec
> > > code used is compatible with both versions...we could just in-line some of
> > > the Base64 code, and comment accordingly.
> > >
> > > That said, we also should be clear about what versions we support, but 6-12
> > > months seems short. The upgrade cycles on Hadoop are really, really long.
> > >
> > >
> > > 2013/2/20 Prashant Kommireddi <prash1784@gmail.com>
> > >
> > >> Agreed, that makes sense. Probably supporting the older Hadoop version for 1
> > >> or 2 Pig releases before moving to a newer/stable version?
> > >>
> > >> Having said that, should we use the 0.11 period to communicate the same to the
> > >> community and start moving on from 0.12 onwards? I know we are way past the 6-12
> > >> months (1-2 release) time frame with 0.20.2, but we also need to make sure
> > >> users are aware and plan accordingly.
> > >>
> > >> I'd also be interested to hear how other projects (Hive, Oozie) are
> > >> handling this.
> > >>
> > >> -Prashant
> > >>
> > >> On Tue, Feb 19, 2013 at 3:22 PM, Olga Natkovich <onatkovich@yahoo.com> wrote:
> > >>
> > >> > It seems that for each Pig release we need to agree and clearly state
> > >> > which Hadoop versions it will support. I guess the main question is how we
> > >> > decide on this. Perhaps we should say that Pig no longer supports older
> > >> > Hadoop versions once the newer one is out for at least 6-12 months to make
> > >> > sure it is stable. I don't think we can support old versions indefinitely.
> > >> > It is in everybody's interest to keep moving forward.
> > >> >
> > >> > Olga
> > >> >
> > >> >
> > >> > ________________________________
> > >> >  From: Prashant Kommireddi <prash1784@gmail.com>
> > >> > To: dev@pig.apache.org
> > >> > Sent: Tuesday, February 19, 2013 10:57 AM
> > >> > Subject: Re: pig 0.11 candidate 2 feedback: Several problems
> > >> >
> > >> > What do you guys feel about the JIRA to do with 0.20.2 compatibility
> > >> > (PIG-3194)? I am interested in discussing the strategy around backward
> > >> > compatibility as this is something that would haunt us each time we move to
> > >> > the next hadoop version. For example, we might be in a similar situation while
> > >> > moving to Hadoop 2.0, when some of the stuff might break for 1.0.
> > >> >
> > >> > I feel it would be good to get this JIRA fix in for 0.11, as 0.20.2 users
> > >> > might be caught unaware. Of course, I must admit there is selfish interest
> > >> > here and it's probably easier for us to have a workaround on Pig rather
> > >> > than upgrade hadoop in all our production DCs.
> > >> >
> > >> > -Prashant
> > >> >
> > >> >
> > >> > On Tue, Feb 19, 2013 at 9:54 AM, Russell Jurney <russell.jurney@gmail.com> wrote:
> > >> >
> > >> > > I think someone should step up and fix the easy ones, if possible.
> > >> > >
> > >> > >
> > >> > > On Tue, Feb 19, 2013 at 9:51 AM, Bill Graham <billgraham@gmail.com> wrote:
> > >> > >
> > >> > > > Thanks Kai for reporting these.
> > >> > > >
> > >> > > > What do people think about the severity of these issues w.r.t. Pig 11? I
> > >> > > > see a few possible options:
> > >> > > >
> > >> > > > 1. We include some or all of these patches in a new Pig 11 rc. We'd want to
> > >> > > > make sure that they don't destabilize the current branch. This approach
> > >> > > > makes sense if we think Pig 11 wouldn't be a good release without one or
> > >> > > > more of these included.
> > >> > > >
> > >> > > > 2. We continue with the Pig 11 release without these, but then include one
> > >> > > > or more in a 0.11.1 release.
> > >> > > >
> > >> > > > 3. We continue with the Pig 11 release without these, but then include them
> > >> > > > in a 0.12 release.
> > >> > > >
> > >> > > > Jon has a patch for the MAP issue
> > >> > > > (PIG-3144 <https://issues.apache.org/jira/browse/PIG-3144>)
> > >> > > > ready, which seems like the most pressing of the three to me.
> > >> > > >
> > >> > > > thanks,
> > >> > > > Bill
> > >> > > >
> > >> > > > On Mon, Feb 18, 2013 at 2:27 AM, Kai Londenberg <kai.londenberg@googlemail.com> wrote:
> > >> > > >
> > >> > > > > Hi,
> > >> > > > >
> > >> > > > > I just subscribed to the dev mailing list in order to give you some
> > >> > > > > feedback on pig 0.11 candidate 2.
> > >> > > > >
> > >> > > > > The following three issues are currently present in 0.11 candidate 2:
> > >> > > > >
> > >> > > > > https://issues.apache.org/jira/browse/PIG-3144 - 'Erroneous map entry
> > >> > > > > alias resolution leading to "Duplicate schema alias" errors'
> > >> > > > > https://issues.apache.org/jira/browse/PIG-3194 - Changes to
> > >> > > > > ObjectSerializer.java break compatibility with Hadoop 0.20.2
> > >> > > > > https://issues.apache.org/jira/browse/PIG-3195 - Race Condition in
> > >> > > > > PhysicalOperator leads to ExecException "Error while trying to get
> > >> > > > > next result in POStream"
> > >> > > > >
> > >> > > > > The last two of these are easily solvable (see the tickets for
> > >> > > > > details on that). The first one is a bit trickier I think, but at
> > >> > > > > least there is a workaround for it (pass Map fields through a UDF).
> > >> > > > >
> > >> > > > > In my personal opinion, each of these problems is pretty severe, but
> > >> > > > > opinions about the importance of the MAP Datatype and STREAM Operator,
> > >> > > > > as well as Hadoop 0.20.2 compatibility might differ.
> > >> > > > >
> > >> > > > > so far ..
> > >> > > > >
> > >> > > > > Kai Londenberg
> > >> > > > >
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > --
> > >> > > > *Note that I'm no longer using my Yahoo! email address. Please email me at
> > >> > > > billgraham@gmail.com going forward.*
> > >> > > >
> > >> > >
> > >> > >
> > >> > >
> > >> > > --
> > >> > > Russell Jurney twitter.com/rjurney russell.jurney@gmail.com
> > >> > > datasyndrome.com
> > >> > >
> > >> >
> > >>
> >
>
>
>
> --
> *Note that I'm no longer using my Yahoo! email address. Please email me at
> billgraham@gmail.com going forward.*
>



-- 
Russell Jurney twitter.com/rjurney russell.jurney@gmail.com datasyndrome.com
