drill-dev mailing list archives

From rahul challapalli <challapallira...@gmail.com>
Subject Re: Potential resource for large scale testing
Date Fri, 18 Sep 2015 16:48:43 GMT
Edmon,

We do have the tests available now [1].

Jacques,

You expressed interest in making these tests available on an Amazon cluster
so that users do not need the physical hardware required to run them.
Do you have any specific thoughts on how to leverage the resources that
Edmon is willing to contribute (performance testing, perhaps)?
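
On the performance side, even a trivial check like the sketch below could be a
first pass against whatever cluster we end up with. This is only an
illustration, assuming the standard Drill REST endpoint on port 8047; the host
and the query are placeholders, not part of the test framework.

    import json
    import time
    import urllib.request

    # Placeholder: point this at any node running a Drillbit in the test cluster.
    DRILLBIT_URL = "http://localhost:8047/query.json"

    def run_query(sql):
        """Submit a SQL query over Drill's REST API and return the parsed JSON."""
        payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
        request = urllib.request.Request(
            DRILLBIT_URL,
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read().decode("utf-8"))

    if __name__ == "__main__":
        started = time.time()
        result = run_query("SELECT * FROM sys.version")
        elapsed = time.time() - started
        print("rows: %d, elapsed: %.2f s" % (len(result.get("rows", [])), elapsed))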


[1] https://github.com/mapr/drill-test-framework

- Rahul

On Thu, Sep 17, 2015 at 8:49 PM, Edmon Begoli <ebegoli@gmail.com> wrote:

> I discussed this idea of bringing a large compute resource to the project
> yesterday with my team at JICS, and there was a general consensus that it
> can be committed.
>
> I will request, and hopefully commit, a pretty large set of
> clustered CPU/storage resources for the needs of the Drill project.
>
> I will be the PI for the resource, and could give access to whomever we
> want to designate from the Drill project side.
>
> Just let me know. I should have the project approved within a few days.
>
> Edmon
>
>
> On Saturday, September 5, 2015, Edmon Begoli <ebegoli@gmail.com> wrote:
>
> > Ted,
> >
> > It is actually very easy and painless to do what I am proposing. I
> > probably made it sound far more bureaucratic/legalistic than it really is.
> >
> > Researchers and projects from across the globe can apply for cycles on
> > Beacon or any other HPC platform we run. Beacon is by far the best, and we
> > already have a setup to run Spark and Hive on it. (We just published a
> > paper at XSEDE about integrating the PBS/TORQUE scheduler with Spark to
> > run JVM-bound jobs.)
> >
> > As for use of the resources, at the end of the year we need to submit
> > reports on all the projects that used compute resources and how they used
> > them. As one of the XSEDE centers, part of our mission is to help promote
> > the advancement of science and technology, and reports from Principal
> > Investigators (PIs) show how we did that. In this case, I can be the PI
> > and have anyone we designate from the Drill team assigned access.
> >
> > I don't think there are any IP issues. Open source project, open research
> > institution, use of resources for testing and benchmarking. We could
> > actually make JICS a benchmarking site for Drill (and even other Apache
> > projects).
> >
> > We'll discuss other details in a hangout. I am also planning to brief my
> > team next Wednesday on the plan for the use of resources.
> >
> > Regards,
> > Edmon
> >
> >
> > On Saturday, September 5, 2015, Ted Dunning <ted.dunning@gmail.com> wrote:
> >
> >> Edmon,
> >>
> >> This is very interesting.  I am sure that public acknowledgements of
> >> contributions are easily managed.
> >>
> >> What might be even more useful for you would be small-scale publications,
> >> especially about the problems of shoe-horning real-world data objects into
> >> the quasi-relational model of Drill.
> >>
> >> What would be problematic (and is probably just a matter of nomenclature)
> >> is naming an institution with the Apache-specific term "committer" (you
> >> said commitment). Individuals at your institution would absolutely be up
> >> for becoming committers as they demonstrate a track record of
> >> contribution.
> >>
> >> I would expect no need for any paperwork between JICS and Apache unless
> >> you would like to execute a corporate contributor license agreement to
> >> ensure that particular individuals are specifically empowered to
> >> contribute code. I don't know what the position of JICS is with respect
> >> to intellectual property, though, so it might be worth checking your
> >> institutional policy on how individuals can contribute to open source
> >> projects. It shouldn't be too hard, since quite a number of NSF-funded
> >> people already contribute.
> >>
> >>
> >>
> >>
> >>
> >> On Fri, Sep 4, 2015 at 9:39 PM, Edmon Begoli <ebegoli@gmail.com> wrote:
> >>
> >> > I can work with my institution and the NSF to commit time on the
> >> > Beacon supercomputing cluster to Apache and the Drill project, maybe
> >> > 20 hours a month on 4-5 nodes.
> >> >
> >> > I have discretionary hours that I can put in, and I can, with our
> >> > HPC admins, create deploy scripts for a few clustered machines (these
> >> > are all very large boxes with 16 cores, 256 GB of RAM, a 40 Gb IB
> >> > interconnect, and a local 1 TB SSD each). There is also the 10 PB
> >> > Medusa filesystem attached, but HDFS over the local drives would
> >> > probably be better. They are otherwise just regular machines, and run
> >> > regular JVMs on Linux.
> >> >
> >> > We can also get Rahul access with a secure token to set up and run
> >> > stress/performance/integration tests for Drill. I can actually help
> >> > there as well. This can be automated to run the tests and collect the
> >> > results.
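> >> >
> >> > Something along these lines would do as a first pass (a very rough
> >> > sketch only; the runner script name, suite names, and paths are
> >> > placeholders, not the test framework's actual interface):
> >> >
> >> >     import datetime
> >> >     import os
> >> >     import subprocess
> >> >
> >> >     RUNNER = "./run_tests.sh"         # placeholder entry point
> >> >     SUITES = ["smoke", "functional"]  # hypothetical suite names
> >> >     LOG_DIR = "results"
> >> >
> >> >     os.makedirs(LOG_DIR, exist_ok=True)
> >> >     stamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
> >> >
> >> >     for suite in SUITES:
> >> >         log_path = os.path.join(LOG_DIR, "%s_%s.log" % (suite, stamp))
> >> >         # Run one suite, capture its output, and record pass/fail
> >> >         # from the exit code.
> >> >         with open(log_path, "w") as log:
> >> >             rc = subprocess.call([RUNNER, suite],
> >> >                                  stdout=log, stderr=subprocess.STDOUT)
> >> >         print("%s: %s (log: %s)"
> >> >               % (suite, "PASSED" if rc == 0 else "FAILED", log_path))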
> >> >
> >> > I think that the only requirement would be that the JICS team be named
> >> > for the commitment, because both NSF/XSEDE and UT like to see the
> >> > resources being officially used and acknowledged. They are there to
> >> > support open and academic research; open source projects fit well.
> >> >
> >> > If this sounds OK with the project PMCs, I can start the process of
> >> > allocation, account creation, and setup.
> >> >
> >> > I would also, as the CDO of JICS, sign whatever standard papers are
> >> > needed with the Apache organization.
> >> >
> >> > With all this being said, please let me know if this is something we
> >> > want to pursue.
> >> >
> >> > Thank you,
> >> > Edmon
> >> >
> >> > On Tuesday, September 1, 2015, Jacques Nadeau <jacques@dremio.com> wrote:
> >> >
> >> > > I spent a bunch of time looking at the Phi coprocessors and forgot to
> >> > > get back to the thread. I'd love it if someone spent some time looking
> >> > > at leveraging them (since Drill is frequently processor-bound). Any
> >> > > takers?
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > Jacques Nadeau
> >> > > CTO and Co-Founder, Dremio
> >> > >
> >> > > On Mon, Aug 31, 2015 at 10:24 PM, Parth Chandra <parthc@apache.org> wrote:
> >> > >
> >> > > > Hi Edmon,
> >> > > >   Sorry no one seems to have got back to you on this.
> >> > > >   We are in the process of publishing a test suite for regression
> >> > > > testing Drill, and the cluster you have (even a few nodes) would be
> >> > > > a great resource for folks to run the test suite. Rahul et al. are
> >> > > > working on this, and I would suggest watching for Rahul's posts on
> >> > > > the topic.
> >> > > >
> >> > > > Parth
> >> > > >
> >> > > > On Tue, Aug 25, 2015 at 9:55 PM, Edmon Begoli <ebegoli@gmail.com> wrote:
> >> > > >
> >> > > > > Hey folks,
> >> > > > >
> >> > > > > As we discussed today on a hangout, this is a machine that we
> >> > > > > have at JICS/NICS where I have Drill installed and where I could
> >> > > > > set up a test cluster over a few nodes.
> >> > > > >
> >> > > > >
> >> > > > > https://www.nics.tennessee.edu/computing-resources/beacon/configuration
> >> > > > >
> >> > > > > Note that each node is:
> >> > > > > - 2x8-core Intel® Xeon® E5-2670 processors
> >> > > > > - 256 GB of memory
> >> > > > > - 4 Intel® Xeon Phi™ coprocessors 5110P with 8 GB of memory each
> >> > > > > - 960 GB of SSD storage
> >> > > > >
> >> > > > > Would someone advise on what would be an interesting test setup?
> >> > > > >
> >> > > > > Thank you,
> >> > > > > Edmon
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
>
