hive-user mailing list archives

From Loren Siebert <>
Subject Re: Single Map task for Hive queries
Date Mon, 15 Aug 2011 18:00:18 GMT
You should not have to do anything special to Hive to make it use all of your TTs. The actual
MR job should be governed by your mapred-site.xml file.

When you run sample MR jobs (like the Pi example) and look at the job tracker, are you seeing
all your TTs getting used?
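
As a rough illustration of why one large file can end up with a single map task, here is a minimal Python sketch of how the number of input splits is commonly estimated for one file. It is a simplified model of the older Hadoop FileInputFormat behaviour, not the actual implementation: the function name and parameters are hypothetical, the defaults (64 MB block size, a requested map count of 2, a minimum split size of 1 byte) are assumptions standing in for dfs.block.size, mapred.map.tasks and mapred.min.split.size, and it ignores details such as the slop factor on the last split. Hive may also use a different input format that groups splits differently.

```python
import math

MIB = 1024 * 1024


def estimate_map_tasks(file_size, block_size=64 * MIB, min_split_size=1,
                       requested_maps=2, splittable=True):
    """Estimate the number of input splits (map tasks) for a single file.

    Hypothetical helper: a simplified model, not Hadoop's actual code.
    All sizes are in bytes.
    """
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    if not splittable:
        # A non-splittable file (e.g. gzip-compressed) is read by one mapper.
        return 1
    # The "goal" size is the file size divided by the requested number of maps.
    goal_size = file_size // max(requested_maps, 1)
    # The split size is capped by the block size and floored by the minimum.
    split_size = max(min_split_size, min(goal_size, block_size))
    return math.ceil(file_size / split_size)


if __name__ == "__main__":
    ten_gib = 10 * 1024 * MIB
    print(estimate_map_tasks(ten_gib))                    # plain text file
    print(estimate_map_tasks(ten_gib, splittable=False))  # gzip-style file
```

Under these assumptions an uncompressed text file spanning many blocks would be expected to produce many splits, so a single map task on such a file suggests looking at the split-related settings or the input format in use.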

On Aug 15, 2011, at 10:47 AM, Jon Bender wrote:

> It's actually just an uncompressed UTF-8 text file.
> This was essentially the create table clause:
> LOCATION '/data/foo'
> Using Hive 0.7.
> On Mon, Aug 15, 2011 at 10:37 AM, Loren Siebert <> wrote:
> Is your external file compressed with GZip or BZip? Those file formats aren’t splittable,
> so they get assigned to one mapper.
> On Aug 15, 2011, at 10:23 AM, Jon Bender wrote:
> > Hello,
> >
> > I have external tables in Hive stored in a single flat text file.  When I execute
> > queries against it, all of my jobs are run as a single map task, even on very large tables.
> >
> > What steps do I need to take to ensure that these queries are split up and pushed
> > out to multiple TTs?  Do I need to store the Hive tables in a different internal file format?
> > Make some configuration changes?
> >
> > Thanks!
> > Jon
