Subject: Re: Potential resource for large scale testing
From: Jacques Nadeau
To: dev@drill.apache.org
Date: Fri, 18 Sep 2015 14:24:59 -0700

Not offhand. It really depends on how the time would work. For example, it
would be nice if we had an automated, perfectly fresh (no .m2/repo) nightly
build and full test suite run so people can always check the status. Maybe
we use this hardware for that?
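A minimal sketch of what such a pristine nightly run could look like,
assuming a standard Maven build of the Apache Drill sources; the working
directory, clone URL, and log path below are illustrative placeholders, not
an agreed setup:

    #!/usr/bin/env bash
    # Hypothetical nightly job: build and test Drill against an empty
    # local Maven repository so cached artifacts cannot mask problems.
    set -euo pipefail

    WORKDIR=/var/lib/drill-nightly              # assumed scratch location
    REPO=https://github.com/apache/drill.git

    rm -rf "$WORKDIR" && mkdir -p "$WORKDIR"
    cd "$WORKDIR"
    git clone --depth 1 "$REPO" drill
    cd drill

    # Point Maven at a fresh local repository instead of ~/.m2/repository,
    # then run the full build and unit test suite.
    mvn clean install -Dmaven.repo.local="$WORKDIR/m2-repo" \
        | tee "$WORKDIR/nightly-$(date +%F).log"

Wiring a script like this into cron or a CI job and publishing the log
somewhere visible would give the "always check the status" view described
above.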
--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Fri, Sep 18, 2015 at 9:48 AM, rahul challapalli <
challapallirahul@gmail.com> wrote:

> Edmon,
>
> We do have the tests available now [1].
>
> Jacques,
>
> You expressed interest in making these tests available on an Amazon
> cluster so that users need not have the physical hardware required to
> run these tests. Do you have any specific thoughts on how to leverage
> the resources that Edmon is willing to contribute (performance testing?)
>
> [1] https://github.com/mapr/drill-test-framework
>
> - Rahul
>
> On Thu, Sep 17, 2015 at 8:49 PM, Edmon Begoli wrote:
>
> > I discussed this idea of bringing a large compute resource to the
> > project yesterday with my team at JICS, and there was a general
> > consensus that it can be committed.
> >
> > I will request and hopefully commit a pretty large set of clustered
> > CPU/storage resources for the needs of the Drill project.
> >
> > I will be the PI for the resource, and could give access to whomever
> > we want to designate from the Drill project side.
> >
> > Just let me know. I should have the project approved within a few days.
> >
> > Edmon
> >
> > On Saturday, September 5, 2015, Edmon Begoli wrote:
> >
> > > Ted,
> > >
> > > It is actually very easy and painless to do what I am proposing. I
> > > probably made it sound far more bureaucratic/legalistic than it
> > > really is.
> > >
> > > Researchers and projects from across the globe can apply for cycles
> > > on Beacon or any other HPC platform we run. (Beacon is by far the
> > > best, and we already have a setup to run Spark and Hive on it. We
> > > just published a paper about it at XSEDE on integrating the
> > > PBS/TORQUE scheduler with Spark to run JVM-bound jobs.)
> > >
> > > As for use of resources, at the end of the year we need to submit
> > > reports for all the projects that used compute resources and how.
> > > It is part of our mission, as one of the XSEDE centers, to help
> > > promote the advancement of science and technology. Reports from
> > > Principal Investigators (PIs) show how we did it. In this case, I
> > > can be the PI and have someone from the Drill team assigned access.
> > >
> > > I don't think there are any IP issues. Open source project, open
> > > research institution, use of resources for testing and benchmarking.
> > > We could actually make JICS a benchmarking site for Drill (and even
> > > other Apache projects).
> > >
> > > We'll discuss other details in a hangout. I am also planning to
> > > brief my team next Wednesday on the plan for the use of resources.
> > >
> > > Regards,
> > > Edmon
> > >
> > > On Saturday, September 5, 2015, Ted Dunning wrote:
> > >
> > >> Edmon,
> > >>
> > >> This is very interesting. I am sure that public acknowledgements of
> > >> contributions are easily managed.
> > >>
> > >> What might be even more useful for you would be small-scale
> > >> publications, especially about the problems of shoe-horning
> > >> real-world data objects into the quasi-relational model of Drill.
> > >>
> > >> What would be problematic (and what is probably just a matter of
> > >> nomenclature) is naming of an institution by the Apache-specific
> > >> term "committer" (you said commitment). Individuals at your
> > >> institution would absolutely be up for being committers as they
> > >> demonstrate a track record of contribution.
> > >>
> > >> I would expect no need for any paperwork between JICS and Apache
> > >> unless you would like to execute a corporate contributor license to
> > >> ensure that particular individuals are specifically empowered to
> > >> contribute code. I don't know what the position of JICS is relative
> > >> to intellectual property, though, so it might be worth checking out
> > >> institutional policy on your side on how individuals can contribute
> > >> to open source projects. It shouldn't be too hard since there are
> > >> quite a number of NSF-funded people who do contribute.
> > >>
> > >> On Fri, Sep 4, 2015 at 9:39 PM, Edmon Begoli wrote:
> > >>
> > >> > I can work with my institution and the NSF so that we commit time
> > >> > on the Beacon supercomputing cluster to Apache and the Drill
> > >> > project. Maybe 20 hours a month for 4-5 nodes.
> > >> >
> > >> > I have discretionary hours that I can put in, and I can, with our
> > >> > HPC admins, create deploy scripts on a few clustered machines
> > >> > (these are all very large boxes with 16 cores, 256 GB, 40 Gb IB
> > >> > interconnect, and a local 1 TB SSD each). There is also the Medusa
> > >> > 10 PB filesystem attached, but HDFS over local drives would
> > >> > probably be better. They are otherwise just regular machines, and
> > >> > run regular JVMs on Linux.
> > >> >
> > >> > We can also get Rahul access with a secure token to set up and
> > >> > run stress/performance/integration tests for Drill. I can actually
> > >> > help there as well. This can be automated to run tests and collect
> > >> > results.
> > >> >
> > >> > I think that the only requirement would be that the JICS team be
> > >> > named for the commitment, because both NSF/XSEDE and UT like to
> > >> > see the resources being officially used and acknowledged. They are
> > >> > there to support open and academic research; open source projects
> > >> > fit well.
> > >> >
> > >> > If this sounds OK with the project PMC, I can start the process of
> > >> > allocation, account creation, and setup.
> > >> >
> > >> > I would also, as the CDO of JICS, sign whatever standard papers
> > >> > are needed with the Apache organization.
> > >> >
> > >> > With all this being said, let me know please if this is something
> > >> > we want to pursue.
> > >> >
> > >> > Thank you,
> > >> > Edmon
> > >> >
> > >> > On Tuesday, September 1, 2015, Jacques Nadeau wrote:
> > >> >
> > >> > > I spent a bunch of time looking at the Phi coprocessors and
> > >> > > forgot to get back to the thread. I'd love it if someone spent
> > >> > > some time looking at leveraging them (since Drill is frequently
> > >> > > processor bound). Any takers?
> > >> > >
> > >> > > --
> > >> > > Jacques Nadeau
> > >> > > CTO and Co-Founder, Dremio
> > >> > >
> > >> > > On Mon, Aug 31, 2015 at 10:24 PM, Parth Chandra <
> > >> > > parthc@apache.org> wrote:
> > >> > >
> > >> > > > Hi Edmon,
> > >> > > > Sorry no one seems to have got back to you on this.
> > >> > > > We are in the process of publishing a test suite for
> > >> > > > regression testing Drill, and the cluster you have (even a few
> > >> > > > nodes) would be a great resource for folks to run the test
> > >> > > > suite. Rahul et al. are working on this, and I would suggest
> > >> > > > watching out for Rahul's posts on the topic.
> > >> > > >
> > >> > > > Parth
> > >> > > >
> > >> > > > On Tue, Aug 25, 2015 at 9:55 PM, Edmon Begoli <
> > >> > > > ebegoli@gmail.com> wrote:
> > >> > > >
> > >> > > > > Hey folks,
> > >> > > > >
> > >> > > > > As we discussed today on a hangout, this is a machine that
> > >> > > > > we have at JICS/NICS where I have Drill installed and where
> > >> > > > > I could set up a test cluster over a few nodes.
> > >> > > > >
> > >> > > > > https://www.nics.tennessee.edu/computing-resources/beacon/configuration
> > >> > > > >
> > >> > > > > Note that each node is:
> > >> > > > > - 2x 8-core Intel® Xeon® E5-2670 processors
> > >> > > > > - 256 GB of memory
> > >> > > > > - 4 Intel® Xeon Phi™ 5110P coprocessors with 8 GB of memory each
> > >> > > > > - 960 GB of SSD storage
> > >> > > > >
> > >> > > > > Would someone advise on what would be an interesting test
> > >> > > > > setup?
> > >> > > > >
> > >> > > > > Thank you,
> > >> > > > > Edmon
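On the question of an interesting test setup: one possible starting point
for a single drillbit per node on hardware like the Beacon nodes described
above, assuming HDFS over the local SSDs and headroom left for the OS and
HDFS daemons. The numbers are guesses to be tuned, not recommendations; the
file shown is the memory section of Drill's conf/drill-env.sh:

    # conf/drill-env.sh on each node -- illustrative sizing only.
    # Out of 256 GB, reserve the remainder for OS page cache and HDFS.
    DRILL_HEAP="16G"                  # JVM heap for the drillbit
    DRILL_MAX_DIRECT_MEMORY="128G"    # direct memory for query execution

The Phi coprocessors would go unused in such a setup; as noted earlier in
the thread, taking advantage of them would need separate investigation.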