drill-dev mailing list archives

From rahul challapalli <challapallira...@gmail.com>
Subject Re: [DISCUSS] Publishing advanced/functional tests
Date Fri, 21 Aug 2015 01:54:40 GMT
Ramana,

Yes, the plan is to have it out with 1.2, and the work is in progress.

- Rahul

On Mon, Aug 17, 2015 at 10:27 AM, Chun Chang <cchang@maprtech.com> wrote:

> Hi Ramana,
>
> Glad to see your post here. I agree with your point that we should have a
> way for the public to run all the pre-commit tests. I feel that's a higher
> priority than anything else since, with that, people can commit their
> patches.
>
> Thanks,
> Chun
>
> On Fri, Aug 14, 2015 at 11:33 AM, Ramana I N <inramana@gmail.com> wrote:
>
> > So what is the status on this? It would be nice to have this out with 1.2
> > coming out.
> >
> > Regards
> > Ramana
> >
> >
> >
> > On Wed, Aug 5, 2015 at 11:08 AM, Abhishek Girish <abhishek.girish@gmail.com> wrote:
> >
> > > Ramana,
> > >
> > > I think the issue with licenses is mostly resolved. It was discussed that
> > > for TPC-*, since we shall not be redistributing the data-gen software, but
> > > distributing a randomized variant of the data generated by it, we should be
> > > okay to include it as part of our framework. For other datasets, we shall
> > > either provide a copy of their license with our framework, or simply provide
> > > a link for users to download the data before they execute.
> > >
> > > For now we should focus on having the framework out with minimal cleanup.
> > > In the near future we can work on setting up infrastructure and enhancing
> > > the framework itself.
> > >
> > > -Abhishek
> > >
> > > On Wed, Aug 5, 2015 at 10:46 AM, Ramana I N <inramana@gmail.com> wrote:
> > >
> > > > @Jacques, Ted
> > > >
> > > > > in the mean time, we risk patches being merged that have less than
> > > > > complete testing.
> > > >
> > > > While I agree with the premise of getting the tests out as soon as possible
> > > > it does not help us achieve anything except transparency. Your statement
> > > > that getting the tests out will increase quality is dependent on someone
> > > > actually being able to run the tests once they have access to it.
> > > >
> > > > Maybe we should focus on making a jenkins job to run the tests publicly.
> > > > With that in place we can exclude the TPC* datasets as well as the yelp
> > > > data sets from the framework and avoid licensing issues.
> > > >
> > > > Regards
> > > > Ramana
> > > >
> > > >
> > > > On Tue, Aug 4, 2015 at 11:39 AM, Abhishek Girish <abhishek.girish@gmail.com> wrote:
> > > >
> > > > > We not only re-distribute external data-sets as-is, but also include
> > > > > variants of those (text -> parquet, json, ...). So the challenge here is
> > > > > not simply disabling automatic downloads via the framework and pointing
> > > > > users to manually download the files before running the framework, but also
> > > > > how we will handle tests which require variants of the data sets. It simply
> > > > > isn't practical for users of the framework to (1) download data-gen manually,
> > > > > (2) use specific seed / options before generating data, (3) convert them to
> > > > > parquet, etc., and (4) move them to specific locations inside their copy of
> > > > > the framework.
> > > > >
> > > > > Something we'll need to know is how other projects are handling benchmark
> > > > > & other external datasets.
> > > > >
> > > > > -Abhishek
> > > > >
> > > > > On Tue, Aug 4, 2015 at 11:23 AM, rahul challapalli <challapallirahul@gmail.com> wrote:
> > > > >
> > > > > > Thanks for your inputs.
> > > > > >
> > > > > > One issue with just publishing the tests in their current state is that
> > > > > > the framework re-distributes tpch, tpcds, yelp data sets without requiring
> > > > > > the users to accept their relevant licenses. A good number of tests use
> > > > > > these data sets. Any thoughts on how to handle this?
> > > > > >
> > > > > > - Rahul
> > > > > >
> > > > > > On Wed, Jul 29, 2015 at 12:07 AM, Ted Dunning <ted.dunning@gmail.com> wrote:
> > > > > >
> > > > > > > +1.  Get it out there.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Tue, Jul 28, 2015 at 10:12 PM, Jacques Nadeau <jacques@dremio.com> wrote:
> > > > > > >
> > > > > > > > Hey Rahul,
> > > > > > > >
> > > > > > > > My suggestion would be to lower the bar--do the absolute bare minimum to
> > > > > > > > get the tests out there.  For example, simply remove proprietary
> > > > > > > > information and then get it on a public github (whether your personal
> > > > > > > > github or a corporate one).  From there, people can help by submitting pull
> > > > > > > > requests to improve the infrastructure and harness.  Making things easier
> > > > > > > > is something that can be done over time.  For example, we've had offers
> > > > > > > > from a couple different Linux Admins to help on something.  I'm sure that
> > > > > > > > they could help with a number of the items you've identified.  In the mean
> > > > > > > > time, we risk patches being merged that have less than complete testing.
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Jacques Nadeau
> > > > > > > > CTO and Co-Founder, Dremio
> > > > > > > >
> > > > > > > > On Mon, Jul 27, 2015 at 2:16 PM, rahul challapalli <challapallirahul@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Jacques,
> > > > > > > > >
> > > > > > > > > I am breaking down steps 1,2 & 3 into sub-tasks so we can add/prioritize
> > > > > > > > > these tasks
> > > > > > > > >
> > > > > > > > > (Columns: Item # / Task / Sub-Task / Comments / Priority)
> > > > > > > > >
> > > > > > > > > Item 1. Task: *Publish the tests*
> > > > > > > > >   - Sub-task: Remove Proprietary Data & Queries
> > > > > > > > >     Priority: 0
> > > > > > > > >   - Sub-task: Redact Proprietary Data/Queries
> > > > > > > > >   - Sub-task: Move tests into drill repo
> > > > > > > > >     Comments: This requires some refactoring to the framework code since
> > > > > > > > >     the test framework uses a 2-level directory structure
> > > > > > > > >   - Sub-task: Organize the tests using a label based approach
> > > > > > > > >     Comments: This involves code changes and moving a lot of files. When
> > > > > > > > >     doing a one time push it might be better to do this before publishing
> > > > > > > > >     the tests?
> > > > > > > > >   - Sub-task: Each suite should be independent
> > > > > > > > >     Comments: Some suites wrongly assume that the data is present. They
> > > > > > > > >     should be identified and fixed
> > > > > > > > >   - Sub-task: Cleanup hardcoded dependencies during data generation
> > > > > > > > >     Comments: Some data-gen scripts have hard-coded references
> > > > > > > > >   - Sub-task: Cleanup downloads
> > > > > > > > >     Comments: The same dataset is being downloaded multiple times by
> > > > > > > > >     different suites
> > > > > > > > >   - Sub-task: Licenses for downloads
> > > > > > > > >     Comments: The framework downloads some files automatically. These
> > > > > > > > >     files are publicly available. However before downloading them users
> > > > > > > > >     need to agree to certain terms. By using the framework users might be
> > > > > > > > >     skipping this step. We should look into this
> > > > > > > > >
> > > > > > > > > Item 2. Task: *Setup a cluster infrastructure to run the pre-commit tests*
> > > > > > > > >
> > > > > > > > > Item 3. Task: *Local debugging of tests*
> > > > > > > > >   - Sub-task: Add an optional maven target for running tests on a local
> > > > > > > > >     machine
> > > > > > > > >     Comments: Tests can launch an embedded drillbit or they can connect
> > > > > > > > >     to a running drillbit through zookeeper
> > > > > > > > >   - Sub-task: Running suites which require additional setup (hive, hbase
> > > > > > > > >     etc) should be made optional
> > > > > > > > >
> > > > > > > > > Item 4. Task: *Documentation*
> > > > > > > > >   - Sub-task: Running Tests (options available and also listing the
> > > > > > > > >     assumed defaults)
> > > > > > > > >   - Sub-task: Explaining how tests are organized
> > > > > > > > >   - Sub-task: Process for adding a new suite
> > > > > > > > >
> > > > > > > > > On Fri, Jul 24, 2015 at 1:40 PM, Jacques Nadeau <jacques@dremio.com> wrote:
> > > > > > > > >
> > > > > > > > > > Let's get number one done (tests out there so all community members can
> > > > > > > > > > run them).  Then the whole community can work together to solve the rest.
> > > > > > > > > >
> > > > > > > > > > I don't think the base install should include integration test execution.
> > > > > > > > > > I do think the tests should be in the main repo (as opposed to a
> > > > > > > > > > secondary).
> > > > > > > > > >
> > > > > > > > > > We should strive to ultimately make running these integration tests a
> > > > > > > > > > requirement for merging.  We need to complete all the steps before we can
> > > > > > > > > > impose that.  I should be able to help on the global run component and
> > > > > > > > > > supporting infrastructure.
> > > > > > > > > >
> > > > > > > > > > J
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Jacques Nadeau
> > > > > > > > > > CTO and Co-Founder, Dremio
> > > > > > > > > >
> > > > > > > > > > On Fri, Jul 24, 2015 at 1:29 PM, rahul challapalli <challapallirahul@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Ramana,
> > > > > > > > > > >
> > > > > > > > > > > You are right. We are trying to address multiple issues here, but not
> > > > > > > > > > > with a single solution. I am summarizing them:
> > > > > > > > > > >
> > > > > > > > > > > 1. Tests should be visible to everyone (Implicit goal)
> > > > > > > > > > > 2. Before applying a patch we should run tests in a clustered
> > > > > > > > > > >    environment. Parth had a suggestion (#4) in his original email.
> > > > > > > > > > > 3. Developers should be able to debug the majority of the tests on
> > > > > > > > > > >    their local environment. I made a few suggestions above in this
> > > > > > > > > > >    regard
> > > > > > > > > > >
> > > > > > > > > > > - Rahul
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Jul 24, 2015 at 10:40 AM, Ramana I N <inramana@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > One important thing which we need to be clear on here is what are we
> > > > > > > > > > > > trying to address?
> > > > > > > > > > > >
> > > > > > > > > > > > I feel there are two separate issues here and I do not think one
> > > > > > > > > > > > solution will fit both the issues.
> > > > > > > > > > > >
> > > > > > > > > > > >    1. Allowing developers to run tests on their local box so they know
> > > > > > > > > > > >    the changes they have are not completely wrong.
> > > > > > > > > > > >    2. Allowing transparency in the integration tests process which is
> > > > > > > > > > > >    currently a black box.
> > > > > > > > > > > >
> > > > > > > > > > > > 1 is needed for developers to make changes and have an idea that their
> > > > > > > > > > > > changes are not going to fail tests en masse in the integration suite.
> > > > > > > > > > > > 2 is needed because it's a prerequisite for changes to be committed.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Regards
> > > > > > > > > > > > Ramana
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Jul 24, 2015 at 10:28 AM, rahul challapalli <challapallirahul@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Ramana,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Let me fill in more details.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 1. Before we accept a patch we want to make sure the tests run in a
> > > > > > > > > > > > > cluster environment. No exceptions here.
> > > > > > > > > > > > > 2. We want the contributors to be able to debug the failing tests on
> > > > > > > > > > > > > their laptops in as many cases as possible. This requires:
> > > > > > > > > > > > >         1. Tests should run on top of a local file system. (Tests can
> > > > > > > > > > > > > launch an embedded drillbit or they can connect to a running drillbit
> > > > > > > > > > > > > through zookeeper)
> > > > > > > > > > > > >         2. Running suites which require additional setup (hive, hbase
> > > > > > > > > > > > > etc) should be made optional and sufficient documentation should be
> > > > > > > > > > > > > provided for enabling and disabling these tests.
> > > > > > > > > > > > > 3. In my opinion making these new tests part of drill would make it
> > > > > > > > > > > > > easier for the developers to debug and run tests instead of having a
> > > > > > > > > > > > > different repository. But as you said it might bloat the drill project
> > > > > > > > > > > > >
> > > > > > > > > > > > > - Rahul
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Jul 24, 2015 at 9:42 AM, Ted Dunning <ted.dunning@gmail.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > The Hadoop family of projects has some software that integrates a
> > > > > > > > > > > > > > continuous integration system so that every time a JIRA is marked as
> > > > > > > > > > > > > > patch-available, the associated patch attached to the bug will have
> > > > > > > > > > > > > > integration tests run against it.  I believe that there has been some
> > > > > > > > > > > > > > process to use git hashes instead of patches.  The CI results are put
> > > > > > > > > > > > > > back on the JIRA.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > This is done using a fairly simple set of scripts.  Apache Yetus is
> > > > > > > > > > > > > > just forming as a direct-to-top-level spinoff from Hadoop
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Proposal is here (don't be fooled by the fact that it looks like an
> > > > > > > > > > > > > > incubation proposal):
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > http://wiki.apache.org/incubator/YetusProposal
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Early code can be found here (don't guess that this is very real yet).
> > > > > > > > > > > > > > More links can be found in the proposal.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > https://github.com/sekikn/pre-yetus/tree/master/precommit/docs
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The project has not yet been formed and there are no mailing lists or
> > > > > > > > > > > > > > git repo yet.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Fri, Jul 24, 2015 at 9:25 AM, Ramana I N <inramana@gmail.com> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > As someone who worked on this for a while, including it as part of
> > > > > > > > > > > > > > > drill may bloat drill a bit too much. Also not a big fan of running
> > > > > > > > > > > > > > > against an embedded drillbit. Does not replicate an actual production
> > > > > > > > > > > > > > > use case.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Additionally, setting up hive, hbase and other components may be
> > > > > > > > > > > > > > > painful and unnecessary for most ppl. It would deter people from ever
> > > > > > > > > > > > > > > contributing to drill. We could spin up in memory hive and hbase but
> > > > > > > > > > > > > > > that's similar to an embedded drillbit. Does not replicate a
> > > > > > > > > > > > > > > production scenario.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Would prefer the hive way with a central Jenkins server hosted on aws
> > > > > > > > > > > > > > > and accessible to everyone.  Users should be able to submit a git url
> > > > > > > > > > > > > > > and that should be able to deploy and fire off tests. Should then
> > > > > > > > > > > > > > > have a way to easily communicate failures to contributors and if
> > > > > > > > > > > > > > > success notify the committers to commit the change.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Ps: if hive's way is open source maybe we can look into reuse rather
> > > > > > > > > > > > > > > than doing it from scratch. Esp the Jenkins and configuration stuff.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Regards
> > > > > > > > > > > > > > > Ramana
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Thursday, July 23, 2015, Parth Chandra <parthc@apache.org> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Drill devs use a set of tests that are not available as part of the
> > > > > > > > > > > > > > > > Apache distribution. These tests are a pre-requisite for all commits,
> > > > > > > > > > > > > > > > but are not available to any contributors outside the current devs.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > This thread is to discuss various options to make these tests
> > > > > > > > > > > > > > > > available.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Assumptions and requirements -
> > > > > > > > > > > > > > > > 1) A functional test (as opposed to a unit test) needs to be closer
> > > > > > > > > > > > > > > > to the end user environment than a development environment. As such,
> > > > > > > > > > > > > > > > we should be running functional tests in a cluster environment,
> > > > > > > > > > > > > > > > connect using zookeeper etc.
> > > > > > > > > > > > > > > > 2) Functional tests will keep increasing in number, get more complex
> > > > > > > > > > > > > > > > and take a longer and longer time to execute as we go along.
> > > > > > > > > > > > > > > > 3) Some requirements are:
> > > > > > > > > > > > > > > >     a) We want to be strict in enforcing the pre-commit requirements,
> > > > > > > > > > > > > > > > but not penalize the contributor who has a minor fix.
> > > > > > > > > > > > > > > >     b) All parts of the product (especially various 'certified'
> > > > > > > > > > > > > > > > storage plugins like Hive and Hbase) should get tested
> > > > > > > > > > > > > > > >     c) It should be easy to debug issues when a test fails. Tests
> > > > > > > > > > > > > > > > should fail deterministically. If a test fails, it should always fail
> > > > > > > > > > > > > > > > and always fail in the same way (easier said than done).
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Some suggestions -
> > > > > > > > > > > > > > > > 1) Tests should be a top-level maven module within the drill project
> > > > > > > > > > > > > > > >         a) We want the integration tests to run as part of the drill's
> > > > > > > > > > > > > > > > maven build process
> > > > > > > > > > > > > > > >         b) The build step for the integration-tests module would
> > > > > > > > > > > > > > > > launch an embedded drillbit and run tests against it
> > > > > > > > > > > > > > > >         c) The tests will be a separate target so they need not be run
> > > > > > > > > > > > > > > > all the time
> > > > > > > > > > > > > > > > 2) Tests should be divided into multiple suites that are based on
> > > > > > > > > > > > > > > > components. For example a test suite for testing datatypes will
> > > > > > > > > > > > > > > > contain the tests for various datatypes including complex types. A
> > > > > > > > > > > > > > > > contributor or developer can then run these tests more frequently as
> > > > > > > > > > > > > > > > an issue is being addressed and run the entire suite only once before
> > > > > > > > > > > > > > > > commit.
> > > > > > > > > > > > > > > > 3) Provide the tests as a hosted service
> > > > > > > > > > > > > > > > 4) Setup a bot to fire the test on an AWS cluster and post the
> > > > > > > > > > > > > > > > results to the JIRA (Hive does this). Or some variant of this idea.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Some questions -
> > > > > > > > > > > > > > > > 1) What do some other projects do?
> > > > > > > > > > > > > > > > 2) Are there any technologies we can leverage that will make this
> > > > > > > > > > > > > > > > easier?
> > > > > > > > > > > > > > > > 3) How do we make it easier to debug failing tests?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Please feel free to question the assumptions and requirements. Be
> > > > > > > > > > > > > > > > creative with your suggestions.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Parth
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
