hadoop-mapreduce-user mailing list archives

From Awhan Patnaik <aw...@spotzot.com>
Subject Error: Java heap space
Date Thu, 03 Dec 2015 15:08:12 GMT
I am dealing with a table that has 71,679,920 records and 30 columns. It
occupies about 13 GB on HDFS. Two columns of this table contain latitude
and longitude (both double) and another column contains a geohash
(12-character strings). I am trying to index this table in Hive.

I run the following two queries in the Hive shell:
create index idx on table vzt_oct (slocnhash) as 'COMPACT' with deferred rebuild;
alter index idx on vzt_oct rebuild;

but I run into a Java heap space error. I have tried setting higher and
higher values for mapreduce.map.memory.mb and mapreduce.reduce.memory.mb.
The default value is 1024. I bumped them up to 2000, 4000, 6000, and so on
up to 10000, after which I run into a job execution error.
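
A minimal sketch of how such settings are typically applied from the Hive
shell, assuming per-session `set` commands. The values are illustrative, and
the two java.opts lines are an assumption that the original message does not
mention. In Hadoop 2.x, mapreduce.*.memory.mb sizes the YARN container, while
the JVM heap is configured separately through -Xmx in mapreduce.*.java.opts.

```sql
-- Container sizes in MB (4000 is one of the values mentioned above).
set mapreduce.map.memory.mb=4000;
set mapreduce.reduce.memory.mb=4000;
-- Assumption: JVM heap set to roughly 80% of the container size.
set mapreduce.map.java.opts=-Xmx3200m;
set mapreduce.reduce.java.opts=-Xmx3200m;
-- Re-run the rebuild in the same session so the settings apply.
alter index idx on vzt_oct rebuild;
```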

There are 50 mappers and 54 reducers. I tried reducing the number of
reducers as well, but that did not fix the problem.

I have a 3-node cluster hosted on EC2, composed of c3.2xlarge instances,
running Hadoop 2.7.0 and Hive 1.2.1. Each machine has 16 GB of RAM.

Questions:
1) Which parameters should I fiddle with?

2) Where are the error logs? $HADOOP_HOME/logs/userlogs?

3) Why is the tracking URL (line 14 in the attached log) not available
after the job fails?

4) Why is the taskdetails.jsp page (line 122 in the attached log) never
available?
