hive-dev mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: RFC: Major HCatalog refactoring
Date Tue, 03 Sep 2013 17:22:55 GMT
I would say a main goal of unit and integration testing is to try all code
paths. If a testing framework is truly testing all code paths twice, there
is not much of a win there from a unit/integration tests standpoint. If the
unit tests created more coverage of the code that would be an obvious win.
I have not looked at your patch, but from your description it sounds like we
are attempting to test a rename, and that does not sound like a win to me.

Say the current hcatalog tests run in 15 minutes; you make a change and then
the run is 30 minutes. 15 minutes is a nice long coffee break, 30 minutes
is a TV show :)

As for the overall hive build taking 10-15 hours: I know that :) I used to
run them, by hand, on my laptop, because no one would share their build
farm with me. I have heard that Hive consumes the vast majority of the
resources of apache's build farm! I think we need to be good citizens at
apache and attempt to make this better, not worse.

Now that we have pre-commit builds we can work at a reasonable pace. Now
that we have this nice pre-commit farm, I do not want to create a precedent
that now we can go "nuts", and start down the same slippery slope.




On Tue, Sep 3, 2013 at 12:57 PM, Eugene Koifman <ekoifman@hortonworks.com> wrote:

> Current (sequential) run of all hive/hcat unit tests takes 10-15 hours.  Is
> another 20-30 minutes that significant?
>
> I'm generally wary of unit tests that are not run continuously and
> automatically.  It delays the detection of problems and then what was
> probably an obvious fix at the time the change was made becomes a long
> debugging session (often by someone other than the person whose change
> broke things).  I think this is especially true given how many people are
> contributing to hive.
>
>
>
> On Tue, Sep 3, 2013 at 7:25 AM, Brock Noland <brock@cloudera.com> wrote:
>
> > OK that should be fine.  Though I would echo Edward's sentiment about
> > adding so much test time. Do these tests have to run each time? Does
> > it make sense to have a test target such as test-all-hcatalog and
> > then run them periodically by hand, especially before
> > releases?
> >
> > On Mon, Sep 2, 2013 at 10:36 AM, Eugene Koifman
> > <ekoifman@hortonworks.com> wrote:
> > > These will be new (i.e. 0.11 version) test classes which will be in the old
> > > org.apache.hcatalog package.  How does that affect the new framework?
> > >
> > > On Saturday, August 31, 2013, Brock Noland wrote:
> > >
> > >> Will these be new Java class files or new test methods to existing
> > >> classes?  I am just curious as to how this will play into the
> > >> distributed testing framework.
> > >>
> > >> On Sat, Aug 31, 2013 at 10:19 AM, Eugene Koifman
> > >> <ekoifman@hortonworks.com> wrote:
> > > >> > not quite double but close (on my Mac that means it will go up from 35
> > > >> > minutes to 55-60) so in the greater scheme of things it should be negligible
> > >> >
> > >> >
> > >> >
> > > >> > On Sat, Aug 31, 2013 at 7:35 AM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
> > >> >
> > >> >> By coverage do you mean to say that:
> > >> >>
> > > >> >> > Thus, the published HCatalog JARs will contain both packages and the unit
> > > >> >> > tests will cover both versions of the API.
> > >> >>
> > >> >> We are going to double the time of unit tests for this module?
> > >> >>
> > >> >>
> > > >> >> On Fri, Aug 30, 2013 at 8:41 PM, Eugene Koifman <ekoifman@hortonworks.com> wrote:
> > >> >>
> > > >> >> > This will change every file under hcatalog so it has to happen before the
> > > >> >> > branching.  Most likely at the beginning of next week.
> > >> >> >
> > >> >> > Thanks
> > >> >> >
> > >> >> >
> > > >> >> > On Wed, Aug 28, 2013 at 5:24 PM, Eugene Koifman <ekoifman@hortonworks.com> wrote:
> > >> >> >
> > >> >> > > Hi,
> > >> >> > >
> > >> >> > >
> > > >> >> > > Here is the plan for refactoring HCatalog as was agreed to when it was
> > > >> >> > > merged into Hive during.  HIVE-4869 is the umbrella bug for this work.  The
> > > >> >> > > changes are complex and touch every single file under hcatalog.  Please
> > > >> >> > > comment.
> > >> >> > >
> > > >> >> > > When HCatalog project was merged into Hive on 0.11 several integration
> > > >> >> > > items did not make the 0.11 deadline.  It was agreed to finish them in 0.12
> > > >> >> > > release.  Specifically:
> > >> >> > >
> > > >> >> > > 1. HIVE-4895 - change package name from org.apache.hcatalog to
> > > >> >> > > org.apache.hive.hcatalog
> > > >> >> > >
> > > >> >> > > 2. HIVE-4896 - create binary backwards compatibility layer for hcat users
> > > >> >> > > upgrading from 0.11 to 0.12
> > >> >> > >
> > > >> >> > > For item 1, we’ll just move every file under org.apache.hcatalog to
> > > >> >> > > org.apache.hive.hcatalog and update all “package” and “import” statements as
> > > >> >> > > well as all hcat/webhcat scripts.  This will include all JUnit tests.
> > >> >> > >
> > > >> >> > > Item 2 will ensure that if a user has a M/R program or Pig script, etc.
> > > >> >> > > that uses HCatalog public API, their programs will continue to work w/o
> > > >> >> > > change with hive 0.12.
> > >> >> > >
> > > >> >> > > The proposal is to make the changes that have as little impact as possible
> > > >> >> > > on the build system, in part to make upcoming ‘mavenization’ of hive easier,
> > > >> >> > > in part to make the changes more manageable.
> > >> >> > >
> > >> >> > >
> > >> >> > >
> > > >> >> > > The list of public interfaces (and their transitive closure) for which
> > > >> >> > > backwards compat will be provided.
> > >> >> > >
> > >> >> > >    1.
> > >> >> > >
> > >> >> > >    HCatLoader
> > >> >> > >    2.
> > >> >> > >
> > >> >> > >    HCatStorer
> > >> >> > >    3.
> > >> >> > >
> > >> >> > >    HCatInputFormat
> > >> >> > >    4.
> > >> >> > >
> > >> >> > >    HCatOutputFormat
> > >> >> > >    5.
> > >> >> > >
> > >> >> > >    HCatReader
> > >> >> > >    6.
> > >> >> > >
> > >> >> > >    HCatWriter
> > >> >> > >    7.
> > >> >> > >
> > >> >> > >    HCatRecord
> > >> >> > >    8.
> > >> >> > >
> > >> >> > >    HCatSchema
> > >> >> > >
> > >> >> > >
> > > >> >> > > To achieve this, 0.11 version of these classes will be added in
> > > >> >> > > org.apache.hcatalog package (after item 1 is done).  Each of these
> > > >> >> > > classes
> > >> --
> > >> Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
> > >>
> > >
> > > --
> > > CONFIDENTIALITY NOTICE
> > > NOTICE: This message is intended for the use of the individual or
> entity
> > to
> > > which it is addressed and may contain information that is confidential,
> > > privileged and exempt from disclosure under applicable law. If the
> reader
> > > of this message is not the intended recipient, you are hereby notified
> > that
> > > any printing, copying, dissemination, distribution, disclosure or
> > > forwarding of this communication is strictly prohibited. If you have
> > > received this communication in error, please contact the sender
> > immediately
> > > and delete it from your system. Thank You.
> >
> >
> >
> >
>
