incubator-drill-user mailing list archives

From Tridib Samanta <tridib.sama...@live.com>
Subject RE: Enable caching in Drill
Date Sat, 01 Nov 2014 06:48:17 GMT
I tested with files of 1 million and 12 million records. But what I understood from the earlier
reply is that JSON files can't be split between workers while reading, and even a simple
calculation can't be faster because of that.
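
One workaround to consider, as a minimal sketch and not something confirmed in this thread: split the large JSON file into several smaller files so that each drillbit could read its own file. This assumes one JSON record per line and that Drill spreads a scan across the files in a directory. The function name `split_json_lines` and the `part-NNN.json` naming are hypothetical.

```python
import os


def split_json_lines(src_path, out_dir, parts):
    """Distribute the records of src_path round-robin across `parts` files.

    Assumes the input holds exactly one JSON record per line.
    Returns the list of output file paths.
    """
    os.makedirs(out_dir, exist_ok=True)
    out_paths = [
        os.path.join(out_dir, f"part-{i:03d}.json") for i in range(parts)
    ]
    outputs = [open(p, "w", encoding="utf-8") for p in out_paths]
    written = 0
    try:
        with open(src_path, encoding="utf-8") as src:
            for line in src:
                if not line.strip():
                    continue  # skip blank lines
                if not line.endswith("\n"):
                    line += "\n"  # last record may lack a newline
                outputs[written % parts].write(line)
                written += 1
    finally:
        for f in outputs:
            f.close()
    return out_paths
```

The resulting directory would then be queried in place of the single file. Whether that actually improves the count time with 4 drillbits would have to be measured.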
 
> From: ted.dunning@gmail.com
> Date: Fri, 31 Oct 2014 15:53:12 -0700
> Subject: Re: Enable caching in Drill
> To: drill-user@incubator.apache.org
> 
> By default, Drill only splits input on fairly large chunks (100,000
> records, I think).
> 
> How many records in your input?
> 
> 
> 
> On Wed, Oct 29, 2014 at 9:46 PM, Tridib Samanta <tridib.samanta@live.com>
> wrote:
> 
> > select count(*) from myhdfs.json.`x00.json`;
> >
> > The surprising thing is, I get the same performance when I use 1 drillbit
> > compared to 4 drillbits.
> >
> > > Date: Thu, 30 Oct 2014 10:08:04 +0530
> > > Subject: Re: Enable caching in Drill
> > > From: mufeed.usman@gmail.com
> > > To: drill-user@incubator.apache.org
> > >
> > > The query didn't get through :-).
> > >
> > >
> > > ---
> > > Mufeed Usman
> > > My LinkedIn <http://www.linkedin.com/pub/mufeed-usman/28/254/400> | My
> > > Social Cause <http://www.vision2016.org.in/> | My Blogs : LiveJournal
> > > <http://mufeed.livejournal.com>
> > >
> > >
> > >
> > >
> > > On Thu, Oct 30, 2014 at 2:54 AM, Tridib Samanta <tridib.samanta@live.com>
> > > wrote:
> > >
> > > > Hello,
> > > > I am doing a count query like below. I understand that it will take a
> > > > long time on the first attempt. But I am not sure why it takes the same
> > > > time on subsequent executions. Will I have to enable caching or
> > > > something like that?
> > > >
> > > > Thanks
> > > > Tridib
> > > >
> >
> >
 		 	   		  