drill-dev mailing list archives

From Jacques Nadeau <jacq...@dremio.com>
Subject Re: Potential resource for large scale testing
Date Fri, 18 Sep 2015 21:24:59 GMT
Not offhand. It really depends on how the time would work. For example, it
would be nice if we had an automated, perfectly fresh (no .m2/repo) nightly
build and full test suite run so people can always check the status. Maybe
we could use this hardware for that?
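
Roughly the kind of thing I have in mind (just a sketch, assuming a plain
Maven build of a fresh apache/drill checkout; the paths, working directory,
and scheduling below are illustrative assumptions, not an existing job):

    #!/usr/bin/env python
    # Sketch of a "fresh" nightly run: wipe the local Maven repository so
    # every artifact is re-resolved, clone Drill from scratch, then build and
    # run the full unit test suite. Paths and layout are illustrative.
    import os
    import shutil
    import subprocess

    M2_REPO = os.path.expanduser("~/.m2/repository")
    WORKDIR = os.path.expanduser("~/nightly/drill")   # hypothetical workdir

    def run(cmd, cwd=None):
        print("+ " + " ".join(cmd))
        subprocess.check_call(cmd, cwd=cwd)

    # Start from an empty .m2 repository and a clean working tree.
    for path in (M2_REPO, WORKDIR):
        if os.path.isdir(path):
            shutil.rmtree(path)

    run(["git", "clone", "https://github.com/apache/drill.git", WORKDIR])

    # "mvn clean install" builds Drill and runs the unit tests as part of
    # the build.
    run(["mvn", "clean", "install"], cwd=WORKDIR)

Wired into cron or Jenkins on the donated hardware, something like this
would give everyone an always-current status to check.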

--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Fri, Sep 18, 2015 at 9:48 AM, rahul challapalli <
challapallirahul@gmail.com> wrote:

> Edmon,
>
> We do have the tests available now [1].
>
> Jacques,
>
> You expressed interest in making these tests available on an Amazon cluster
> so that users need not have the physical hardware required to run these
> tests. Do you have any specific thoughts on how to leverage the resources
> that Edmon is willing to contribute (performance testing?)
>
>
> [1] https://github.com/mapr/drill-test-framework
>
> - Rahul
>
> On Thu, Sep 17, 2015 at 8:49 PM, Edmon Begoli <ebegoli@gmail.com> wrote:
>
> > I discussed this idea of bringing a large compute resource to the project
> > yesterday with my team at JICS, and there was a general consensus that it
> > can be committed.
> >
> > I will request, and hopefully commit, a pretty large set of
> > clustered CPU/storage resources for the needs of the Drill project.
> >
> > I will be the PI for the resource, and could give access to whomever we
> > want to designate from the Drill project side.
> >
> > Just let me know. I should have the project approved within a few days.
> >
> > Edmon
> >
> >
> > On Saturday, September 5, 2015, Edmon Begoli <ebegoli@gmail.com> wrote:
> >
> > > Ted,
> > >
> > > It is actually very easy and painless to do what I am proposing. I
> > > probably made it sound far more bureaucratic/legalistic than it really
> > > is.
> > >
> > > Researchers and projects from across the globe can apply for cycles on
> > > Beacon or any other HPC platform we run. (Beacon is by far the best, and
> > > we already have a setup to run Spark and Hive on it; we just published a
> > > paper about it at XSEDE on integrating the PBS/TORQUE scheduler with
> > > Spark to run JVM-bound jobs.)
> > >
> > > As for the use of resources, at the end of the year we need to submit
> > > reports for all the projects that used compute resources and how they
> > > used them. It is part of our mission, as one of the XSEDE centers, to
> > > help promote the advancement of science and technology. Reports from
> > > Principal Investigators (PIs) show how we did that. In this case, I can
> > > be the PI and have anyone we designate from the Drill team assigned
> > > access.
> > >
> > > I don't think there are any IP issues: open source project, open research
> > > institution, use of resources for testing and benchmarking. We could
> > > actually make JICS a benchmarking site for Drill (and even other Apache
> > > projects).
> > >
> > > We'll discuss other details in a hangout. I am also planning to brief my
> > > team next Wednesday on the plan for the use of resources.
> > >
> > > Regards,
> > > Edmon
> > >
> > >
> > > On Saturday, September 5, 2015, Ted Dunning <ted.dunning@gmail.com> wrote:
> > >
> > >> Edmon,
> > >>
> > >> This is very interesting.  I am sure that public acknowledgements of
> > >> contributions are easily managed.
> > >>
> > >> What might be even more useful for you would be small-scale
> > >> publications, especially about the problems of shoehorning real-world
> > >> data objects into the quasi-relational model of Drill.
> > >>
> > >> What would be problematic (and what is probably just a matter of
> > >> nomenclature) is naming an institution with the Apache-specific term
> > >> "committer" (you said commitment). Individuals at your institution would
> > >> absolutely be up for becoming committers as they demonstrate a track
> > >> record of contribution.
> > >>
> > >> I would expect no need for any paperwork between JICS and Apache unless
> > >> you would like to execute a corporate contributor license to ensure that
> > >> particular individuals are specifically empowered to contribute code. I
> > >> don't know what the position of JICS is relative to intellectual
> > >> property, though, so it might be worth checking institutional policy on
> > >> your side on how individuals can contribute to open source projects. It
> > >> shouldn't be too hard, since there are quite a number of NSF-funded
> > >> people who do contribute.
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> On Fri, Sep 4, 2015 at 9:39 PM, Edmon Begoli <ebegoli@gmail.com>
> wrote:
> > >>
> > >> > I can work with my institution and the NSF so that we commit time on
> > >> > the Beacon supercomputing cluster to Apache and the Drill project,
> > >> > maybe 20 hours a month for 4-5 nodes.
> > >> >
> > >> > I have discretionary hours that I can put in, and I can, with our
> > >> > HPC admins, create deploy scripts on a few clustered machines (these
> > >> > are all very large boxes with 16 cores, 256 GB of memory, a 40 Gb IB
> > >> > interconnect, and a local 1 TB SSD each). There is also the 10 PB
> > >> > Medusa filesystem attached, but HDFS over the local drives would
> > >> > probably be better. They are otherwise just regular machines, and run
> > >> > regular JVMs on Linux.
> > >> >
> > >> > We can also get Rahul access with a secure token to set up
> > >> > and run stress/performance/integration tests for Drill. I can actually
> > >> > help there as well. This can be automated to run tests and collect
> > >> > results.
> > >> >
> > >> > I think that the only requirement would be that the JICS team be named
> > >> > in the commitment, because both NSF/XSEDE and UT like to see the
> > >> > resources being officially used and acknowledged. They are there to
> > >> > support open and academic research; open source projects fit well.
> > >> >
> > >> > If this sounds OK with the project PMC, I can start the process of
> > >> > allocation, account creation, and setup.
> > >> >
> > >> > I would also, as the CDO of JICS, sign whatever standard paperwork is
> > >> > needed with the Apache organization.
> > >> >
> > >> > With all this being said, please let me know if this is something we
> > >> > want to pursue.
> > >> >
> > >> > Thank you,
> > >> > Edmon
> > >> >
> > >> > On Tuesday, September 1, 2015, Jacques Nadeau <jacques@dremio.com>
> > >> wrote:
> > >> >
> > >> > > I spent a bunch of time looking at the Phi coprocessors and forgot
> > >> > > to get back to the thread. I'd love it if someone spent some time
> > >> > > looking at leveraging them (since Drill is frequently
> > >> > > processor-bound). Any takers?
> > >> > >
> > >> > >
> > >> > >
> > >> > > --
> > >> > > Jacques Nadeau
> > >> > > CTO and Co-Founder, Dremio
> > >> > >
> > >> > > On Mon, Aug 31, 2015 at 10:24 PM, Parth Chandra <parthc@apache.org>
> > >> > > wrote:
> > >> > >
> > >> > > > Hi Edmon,
> > >> > > >   Sorry no one seems to have gotten back to you on this.
> > >> > > >   We are in the process of publishing a test suite for regression
> > >> > > > testing Drill, and the cluster you have (even a few nodes) would be
> > >> > > > a great resource for folks to run the test suite. Rahul et al. are
> > >> > > > working on this, and I would suggest watching out for Rahul's posts
> > >> > > > on the topic.
> > >> > > >
> > >> > > > Parth
> > >> > > >
> > >> > > > On Tue, Aug 25, 2015 at 9:55 PM, Edmon Begoli <ebegoli@gmail.com>
> > >> > > > wrote:
> > >> > > >
> > >> > > > > Hey folks,
> > >> > > > >
> > >> > > > > As we discussed today on a hangout, this is a machine that we
> > >> > > > > have at JICS/NICS, where I have Drill installed and where I could
> > >> > > > > set up a test cluster over a few nodes.
> > >> > > > >
> > >> > > > >
> > >> > >
> > >>
> > >> > > > > https://www.nics.tennessee.edu/computing-resources/beacon/configuration
> > >> > > > >
> > >> > > > > Note that each node has:
> > >> > > > > - 2x8-core Intel® Xeon® E5-2670 processors
> > >> > > > > - 256 GB of memory
> > >> > > > > - 4 Intel® Xeon Phi™ coprocessors 5110P with 8 GB of memory each
> > >> > > > > - 960 GB of SSD storage
> > >> > > > >
> > >> > > > > Would someone advise on what would be an interesting test setup?
> > >> > > > >
> > >> > > > > Thank you,
> > >> > > > > Edmon
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> > >
> >
>
