hive-user mailing list archives

From diogo <>
Subject Multiple joins cause failures in Reduce phase
Date Thu, 10 Jul 2014 22:37:06 GMT
 So, I have a query like this:

ud_name.value as name
ud_age.value as age
from user
left outer join user_data ud_name on = ud_name.user_id and ud_name.key = 'name'
left outer join user_data ud_age on = ud_age.user_id and ud_age.key = 'age'

There are multiple joins over user_data, basically to flatten user_data into
one row per user. I have 9 left outer joins and I'm getting reducer errors like:

Error: java.lang.RuntimeException: Hive Runtime Error while closing operators: null
    at org.apache.hadoop.hive.ql.exec.ExecReducer.close(
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(
    at org.apache.hadoop.mapred.YarnChild$
    at Method)
    at org.apache.hadoop.mapred.YarnChild.main(
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.ExecReducer.close(
    ... 8 more

The weird thing is: if I remove one of the joins, regardless of which one,
the query runs fine. So, 8 joins seem to be the magic number. Has anyone
ever seen something like this?
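
In case it helps anyone picture the query shape, here is a minimal sketch of the same flattening pattern in Python with sqlite3 (not Hive). The join key `u.id`, the aliases `u` and `ud_0`, `ud_1`, ..., the helper name `flatten_user_data`, and the sample rows are all assumptions made up for illustration. It only shows the one-left-outer-join-per-key pattern and does not reproduce the reducer NullPointerException.

```python
import sqlite3


def flatten_user_data(conn, keys):
    """Run one left outer join per key and return one row per user.

    Hypothetical helper: table and column names follow the post, but the
    join key u.id is an assumption (it is missing from the archived query).
    """
    select_cols = ["u.id"]
    joins = []
    for i, _key in enumerate(keys):
        alias = f"ud_{i}"
        select_cols.append(f"{alias}.value")
        # Each key gets its own join against user_data, as in the post.
        joins.append(
            f"left outer join user_data {alias} "
            f"on u.id = {alias}.user_id and {alias}.key = ?"
        )
    sql = (
        f"select {', '.join(select_cols)} from user u "
        f"{' '.join(joins)} order by u.id"
    )
    # The ? placeholders appear in the same order as the keys.
    return conn.execute(sql, list(keys)).fetchall()


# Tiny in-memory data set (made-up sample rows).
conn = sqlite3.connect(":memory:")
conn.execute("create table user (id integer primary key)")
conn.execute("create table user_data (user_id integer, key text, value text)")
conn.executemany("insert into user (id) values (?)", [(1,), (2,)])
conn.executemany(
    "insert into user_data (user_id, key, value) values (?, ?, ?)",
    [(1, "name", "alice"), (1, "age", "30"), (2, "name", "bob")],
)

rows = flatten_user_data(conn, ["name", "age"])
print(rows)
```

A user with no row for a given key gets NULL (None in Python) in that column, which is the reason for using left outer joins rather than inner joins here.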
