hive-user mailing list archives

From Peter Chu <pete....@outlook.com>
Subject Map Join Problems
Date Tue, 28 May 2013 03:17:54 GMT
I am using Hive 0.8.1 in an Amazon EMR Hadoop job and have run into some problems with mapjoin:
1) Memory exceeded: I got the errors shown below. I then removed mapjoin from the query and
instead set hive.auto.convert.join=true, thinking this would let Hive decide when a mapjoin is
suitable. The job ran much further, but then hit a similar error towards the end.
2) I then tried the same query with mapjoin as before and set hive.mapjoin.localtask.max.memory.usage=3,
and got exactly the same error.
My question: are there any other settings I can use to increase the mapjoin memory or hashtable
size? Or are there any better options?
2013-05-25 11:37:39	Starting to launch local task to process map join;	maximum memory = 932118528
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/.versions/hive-0.8.1/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2013-05-25 11:37:57	Processing rows:	200000	Hashtable size:	199999	Memory usage:	776687416
rate:	0.833
2013-05-25 11:38:00	Processing rows:	215031	Hashtable size:	215031	Memory usage:	813018320
rate:	0.872
2013-05-25 11:38:00	Dump the hashtable into file: file:/tmp/hadoop/hive_2013-05-25_23-37-37_320_2027014861824847272/-local-10006/HashTable-Stage-6/MapJoin-bu-21--.hashtable
Execution failed with exit status: 2
Obtaining error information

Task failed!
Task ID:
  Stage-10

Logs:
3) Please look at the four example errors below. My other question: across all the runs
with this mapjoin error there is a pattern. The mapjoin is done on bu, which comes from a job
that runs before the error, and in every run the error happens exactly 4 rows short of the
number of rows loaded for the bu table. I find this too much of a coincidence. Can someone
please offer some insight?

#1:
215035 Rows loaded to hdfs://10.190.182.26:9000/mnt/var/lib/hive_081/tmp/scratch/hive_2013-05-25_23-35-49_100_9059150281675034748/-ext-10000
MapReduce Jobs Launched: 
Job 0: Map: 18  Reduce: 13   Accumulative CPU: 139.31 sec   HDFS Read: 1226414348 HDFS Write: 2179 SUCCESS
Job 1: Map: 9   Accumulative CPU: 54.1 sec   HDFS Read: 687306237 HDFS Write: 695722636 SUCCESS
Job 2: Map: 16   Accumulative CPU: 89.09 sec   HDFS Read: 695838641 HDFS Write: 703096594 SUCCESS
Total MapReduce CPU Time Spent: 4 minutes 42 seconds 500 msec
OK
Time taken: 108.206 seconds
OK
Time taken: 0.013 seconds
Total MapReduce jobs = 3
Execution log at: /tmp/hadoop/hadoop_20130525233737_8911fdca-6536-45bb-aac3-19b92b3de99c.log
2013-05-25 11:37:39	Starting to launch local task to process map join;	maximum memory = 932118528
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/.versions/hive-0.8.1/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2013-05-25 11:37:57	Processing rows:	200000	Hashtable size:	199999	Memory usage:	776687416
rate:	0.833
2013-05-25 11:38:00	Processing rows:	215031	Hashtable size:	215031	Memory usage:	813018320
rate:	0.872
2013-05-25 11:38:00	Dump the hashtable into file: file:/tmp/hadoop/hive_2013-05-25_23-37-37_320_2027014861824847272/-local-10006/HashTable-Stage-6/MapJoin-bu-21--.hashtable
Execution failed with exit status: 2
Obtaining error information

Task failed!
Task ID:
  Stage-10

Logs:

#2:
Table default.badurls stats: [num_partitions: 0, num_files: 18, num_rows: 0, total_size: 701922144, raw_data_size: 0]
214618 Rows loaded to hdfs://10.46.205.55:9000/mnt/var/lib/hive_081/tmp/scratch/hive_2013-05-25_23-12-54_513_8781300101638774300/-ext-10000
MapReduce Jobs Launched: 
Job 0: Map: 21  Reduce: 13   Accumulative CPU: 142.11 sec   HDFS Read: 1225164183 HDFS Write: 2179 SUCCESS
Job 1: Map: 9   Accumulative CPU: 53.25 sec   HDFS Read: 686157231 HDFS Write: 694562725 SUCCESS
Job 2: Map: 18   Accumulative CPU: 92.26 sec   HDFS Read: 694650326 HDFS Write: 701922144 SUCCESS
Total MapReduce CPU Time Spent: 4 minutes 47 seconds 620 msec
OK
Time taken: 104.744 seconds
OK
Time taken: 0.013 seconds
Total MapReduce jobs = 3
Execution log at: /tmp/hadoop/hadoop_20130525231414_3cc50fdd-7e7a-4bcf-baab-b465804c6e49.log
2013-05-25 11:14:41	Starting to launch local task to process map join;	maximum memory = 932118528
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/.versions/hive-0.8.1/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2013-05-25 11:15:04	Processing rows:	200000	Hashtable size:	199999	Memory usage:	776903128
rate:	0.833
2013-05-25 11:15:09	Processing rows:	214614	Hashtable size:	214614	Memory usage:	811592328
rate:	0.871
2013-05-25 11:15:09	Dump the hashtable into file: file:/tmp/hadoop/hive_2013-05-25_23-14-39_270_9001068894438685980/-local-10006/HashTable-Stage-6/MapJoin-bu-21--.hashtable
Execution failed with exit status: 2
Obtaining error information

Task failed!
Task ID:
  Stage-10

Logs:

#3:
212590 Rows loaded to hdfs://10.96.162.172:9000/mnt/var/lib/hive_081/tmp/scratch/hive_2013-05-25_22-48-32_821_7905326830391538266/-ext-10000
MapReduce Jobs Launched: 
Job 0: Map: 19  Reduce: 13   Accumulative CPU: 139.04 sec   HDFS Read: 1224727949 HDFS Write: 2179 SUCCESS
Job 1: Map: 9   Accumulative CPU: 53.03 sec   HDFS Read: 685799930 HDFS Write: 694158841 SUCCESS
Job 2: Map: 17   Accumulative CPU: 90.31 sec   HDFS Read: 694245117 HDFS Write: 701443582 SUCCESS
Total MapReduce CPU Time Spent: 4 minutes 42 seconds 380 msec
OK
Time taken: 105.029 seconds
OK
Time taken: 0.012 seconds
Total MapReduce jobs = 3
Execution log at: /tmp/hadoop/hadoop_20130525225050_e3192035-d205-4ba3-9ac1-442c072d725d.log
2013-05-25 10:50:20	Starting to launch local task to process map join;	maximum memory = 932118528
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/.versions/hive-0.8.1/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2013-05-25 10:50:47	Processing rows:	200000	Hashtable size:	199999	Memory usage:	775795904
rate:	0.832
2013-05-25 10:50:50	Processing rows:	212586	Hashtable size:	212586	Memory usage:	813714632
rate:	0.873
2013-05-25 10:50:50	Dump the hashtable into file: file:/tmp/hadoop/hive_2013-05-25_22-50-17_863_2904729913146511194/-local-10006/HashTable-Stage-6/MapJoin-bu-21--.hashtable
Execution failed with exit status: 2
Obtaining error information

#4:
212271 Rows loaded to hdfs://10.4.26.233:9000/mnt/var/lib/hive_081/tmp/scratch/hive_2013-05-25_22-19-40_060_2510727650477507447/-ext-10000
MapReduce Jobs Launched: 
Job 0: Map: 21  Reduce: 13   Accumulative CPU: 142.28 sec   HDFS Read: 1224302437 HDFS Write: 2179 SUCCESS
Job 1: Map: 9   Accumulative CPU: 52.82 sec   HDFS Read: 685493907 HDFS Write: 693841827 SUCCESS
Job 2: Map: 17   Accumulative CPU: 90.52 sec   HDFS Read: 693926843 HDFS Write: 701115104 SUCCESS
Total MapReduce CPU Time Spent: 4 minutes 45 seconds 620 msec
OK
Time taken: 117.335 seconds
OK
Time taken: 0.011 seconds
Total MapReduce jobs = 3
Execution log at: /tmp/hadoop/hadoop_20130525222121_d91f5176-958e-4b10-896d-ffd427e1a12c.log
2013-05-25 10:21:39	Starting to launch local task to process map join;	maximum memory = 932118528
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/.versions/hive-0.8.1/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2013-05-25 10:21:57	Processing rows:	200000	Hashtable size:	199999	Memory usage:	782325272
rate:	0.839
2013-05-25 10:22:00	Processing rows:	212267	Hashtable size:	212267	Memory usage:	809524488
rate:	0.868
2013-05-25 10:22:00	Dump the hashtable into file: file:/tmp/hadoop/hive_2013-05-25_22-21-37_408_6569501641432754678/-local-10006/HashTable-Stage-6/MapJoin-bu-31--.hashtable
Execution failed with exit status: 2
Obtaining error information 		 	   		  
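
To make the pattern from 3) easier to check, here is a rough Python sketch that compares the figures from the four runs. All numbers are copied from the logs above. The names RUNS, MAX_MEMORY and summarize are made up for this illustration and are not part of Hive. Treating "rate" as memory usage divided by maximum memory is an assumption, although it does match the logged values.

```python
# Figures copied from the four failed runs above:
# run number -> (rows loaded, last "Processing rows" count, last "Memory usage" in bytes)
RUNS = {
    1: (215035, 215031, 813018320),
    2: (214618, 214614, 811592328),
    3: (212590, 212586, 813714632),
    4: (212271, 212267, 809524488),
}

# "maximum memory" printed when the local task starts (same in all four runs)
MAX_MEMORY = 932118528


def summarize(runs, max_memory):
    """Return one summary dict per run, ordered by run number."""
    summary = []
    for run_id, (loaded, processed, mem_bytes) in sorted(runs.items()):
        summary.append({
            "run": run_id,
            # how many rows short of the loaded count the local task stopped
            "rows_short": loaded - processed,
            # assumed meaning of the logged "rate": used memory / maximum memory
            "rate": round(mem_bytes / max_memory, 3),
            # rough hashtable memory cost per processed row, in bytes
            "bytes_per_row": mem_bytes // processed,
        })
    return summary


if __name__ == "__main__":
    for row in summarize(RUNS, MAX_MEMORY):
        print(row)
```

In every run rows_short comes out as 4, and the computed rate matches the last "rate" value in the corresponding log.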