drill-user mailing list archives

From Tom Seddon <mr.tom.sed...@gmail.com>
Subject Re: Drill Masters Project
Date Thu, 29 Aug 2013 09:25:20 GMT
Thanks Jacques.  I'm very happy to get involved and share my experiences.

I'm looking for the best way to set up a cluster now.  In terms of
evaluating Drill's performance, do you think it's especially important to
have a system that would be close in performance to a production cluster,
or would it be worthwhile exploring it on a small scale?  As a student I
have a limited budget, so I'm exploring things like Raspberry Pi clusters,
though I don't think their performance improves linearly as you scale out.
I'm also enquiring about EC2 or GCE student licensing.

On 29 August 2013 05:08, Jacques Nadeau <jacques@apache.org> wrote:

> A Hadoop cluster would be a good start.  We're in the process right now of
> putting together distributable files which will help get you up to speed
> quickly.  Contribution isn't just code; there are many types, and I'm sure
> you can help in any number of ways.  Just documenting your early
> experiences and advice would be a great way to start helping out.
> Jacques
> On Sun, Aug 25, 2013 at 1:25 PM, Tom Seddon <mr.tom.seddon@gmail.com>
> wrote:
> > Hi,
> >
> > I'm looking to do a dissertation on Drill, as part of a master's degree in
> > Data Science.  I'm hoping to set up a cluster to run it and then analyse
> > its efficiency with different datasets, as well as make recommendations
> > for its usage.  I know Drill is in a fairly early stage of development,
> > but I have around 18 months until the project is due, so I'm hoping the
> > timing will work as Drill is developed further.
> >
> > I'd be grateful for any advice on how I could get started on this.  Would
> > a Hadoop cluster be a good back-end to base my project on, or would
> > something more suited to nested data, like MongoDB, be more appropriate?
> > Also, I haven't found much documentation on configuring Drill in a
> > distributed environment, so any help on this would be appreciated.
> >
> > I'd also be willing to contribute, but I'm not sure if I have enough Java
> > experience.  My background is mainly in BI and database technologies.
> >
> > Thanks,
> >
> > Tom
> >
