hive-user mailing list archives

From Shantian Purkad <shantian_pur...@yahoo.com>
Subject Skew Join Optimization in hive
Date Tue, 07 Jun 2011 19:31:17 GMT
Hi,

I have a query that joins 12 different tables (most of them left outer joins), and it takes
almost 3 hours. About 90% of that time is spent in a single reducer, which receives the bulk
of the data to process.

How can I get around this and get a fair distribution of data across all reducers? I tried
enabling the skew join optimization, but I get the NPE below after the first step of the job is executed.
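
For context, skew join is normally switched on per session with Hive settings like the ones sketched here. The message does not say which values were actually used, so the threshold shown is only Hive's documented default, not a known setting from this job.

```sql
-- Sketch only: the values used in the failing session are not stated in this message.
-- Turn on the runtime skew join optimization (off by default).
set hive.optimize.skewjoin=true;
-- Row count per join key above which the key is treated as skewed
-- (100000 is the documented default, shown here as an assumption).
set hive.skewjoin.key=100000;
```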

Any suggestions or ideas would be of great help.

Thanks,
Shantian

2011-06-07 19:22:28,923 Stage-11 map = 100%,  reduce = 85%
2011-06-07 19:22:30,932 Stage-11 map = 100%,  reduce = 100%
Ended Job = job_201106071542_0010
java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.plan.ConditionalResolverSkewJoin.getTasks(ConditionalResolverSkewJoin.java:97)
    at org.apache.hadoop.hive.ql.exec.ConditionalTask.execute(ConditionalTask.java:81)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1063)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:900)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:748)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.ConditionalTask
hive> 