hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Siddharth Tiwari <siddharth.tiw...@live.com>
Subject RE: Query hangs at 99.97 % for one reducer in Hive
Date Mon, 03 Mar 2014 04:48:36 GMT
Hi Sanjay,here is teh detail
Even 500 reducers sounds a high number but I don't know the deatils of your cluster. Can u
provide some details
How many nodes in cluster : 21 Nodes
Hive version : Hive-0.10.x
Which distribution (Hortonworks, Apache, CDH, Amazon): CDH 4.3
Node specs: Each node 64 cores ( with HT), 128 GB RAM, 3*10 TB disks
Partitions in the table: none 
Number of records: around 17 Billion
Also,I see following exception thrown by hanging task, I have no idea what does this mean
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime
Error while processing row (tag=1) {"key":{"joinkey0":""},"value":{"_col2":"92","_col11":"-60-01-21,00","_col12":"-03-07-04,00"},"alias":1}
	at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:270)
	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
	at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing
row (tag=1) {"key":{"joinkey0":""},"value":{"_col2":"92","_col11":"-60-01-21,00","_col12":"-03-07-04,00"},"alias":1}
	at org.apache.hadoop.hive.ql.exec.ExecRedu
*------------------------*

Cheers !!!

Siddharth Tiwari

Have a refreshing day !!!
"Every duty is holy, and devotion to duty is the highest form of worship of God.” 

"Maybe other people will try to limit me but I don't limit myself"


From: siddharth.tiwari@live.com
To: user@hadoop.apache.org
Subject: FW: Query hangs at 99.97 % for one reducer in Hive
Date: Sun, 2 Mar 2014 23:12:47 +0000




Forwarding message to hadoop list as well for any help. Appreciate any help

*------------------------*

Cheers !!!

Siddharth Tiwari

Have a refreshing day !!!
"Every duty is holy, and devotion to duty is the highest form of worship of God.” 

"Maybe other people will try to limit me but I don't limit myself"


From: siddharth.tiwari@live.com
To: user@hive.apache.org
Subject: Query hangs at 99.97 % for one reducer in Hive
Date: Sun, 2 Mar 2014 23:09:25 +0000




Hi team,
following query hangs at 99.97% for one reducer, kindly help or point to what can be cause
drop table if exists sample.dpi_short_lt;create table sample.dpi_short_lt asselect       
           b.msisdn,                a.area_erb,                a.longitude,              
 a.latitude,                               substring(b.msisdn,1,2) as country,           
                   substring(b.msisdn,3,2) as area_code,                               substring(b.start_time,1,4)
as year,                               substring(b.start_time,6,2) as month,             
                 substring(b.start_time,9,2) as day,                               substring(b.start_time,12,2)
as hour,                               cast(b.procedure_duration as double) as duracao_ms,
                              cast(b.internet_latency as double) as int_internet_latency,
                              cast(b.ran_latency as double) as int_ran_latency,          
                    cast(b.http_latency as double) as int_http_latency,                  
            (case when b.internet_latency='' then 1 else 0 end) as internet_latency_missing,
                              (case when b.ran_latency='' then 1 else 0 end) as ran_latency_missing,
                              (case when b.http_latency='' then 1 else 0 end) as http_latency_missing,
                              (cast(b.mean_throughput_ul as int) * cast( procedure_duration
as int) / 1000) as total_up_bytes,                               (cast(b.mean_throughput_dl
as int) * cast(procedure_duration as int)  / 1000) as total_dl_bytes,                    
          cast(b.missing_packets_ul as int) as int_missing_packets_ul,                   
           cast(b.missing_packets_dl as int) as int_missing_packets_dlfrom sample.dpi_large
bleft outer join sample.science_new aon b.cgi = regexp_replace(a.codigo_cgi_ecgi,'-','')where
msisdn!='';
Hive was heuristically selecting 1000 reducers and it was hanging at 99.97 percent on one
reduce task. I then changed the above values to 3GB per reducer and 500 reducers and started
hitting this error.
java.lang.RuntimeException: Hive Runtime Error while closing operators: Unable to rename output
from: hdfs://tlvcluster/tmp/hive-hadoop/hive_2014-03-01_03-14-36_812_8390586541316719852-1/_task_tmp.-ext-10001/_tmp.000003_0
to: hdfs://tlvcluster/tmp/hive-hadoop/hive_2014-03-01_03-14-36_812_8390586541316719852-1/_tmp.-ext-10001/000003_0
	at org.apache.hadoop.hive.ql.exec.ExecReducer.close(ExecReducer.java:313)
	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:516)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
	at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename output from:
hdfs://tlvcluster/tmp/hive-hadoop/hive_2014-03-01_03-14-36_812

I have 22 node cluster running cdh 4.3. Please try to locate what can be teh issue.

*------------------------*

Cheers !!!

Siddharth Tiwari

Have a refreshing day !!!
"Every duty is holy, and devotion to duty is the highest form of worship of God.” 

"Maybe other people will try to limit me but I don't limit myself"
 		 	   		   		 	   		   		 	   		  
Mime
View raw message