hive-user mailing list archives

From Gopal Vijayaraghavan <gop...@apache.org>
Subject Re: Beeline throws OOM on large input query
Date Tue, 06 Sep 2016 16:50:28 GMT

>  1) confirm your beeline java process is indeed running with expanded memory

The OOM error is clearly coming from the HiveServer2 CBO codepath post beeline.

        at org.apache.calcite.rel.AbstractRelNode$1.explain_(AbstractRelNode.java:409)
        at org.apache.calcite.rel.externalize.RelWriterImpl.done(RelWriterImpl.java:157)
        at org.apache.calcite.rel.AbstractRelNode.explain(AbstractRelNode.java:308)
        at org.apache.calcite.rel.AbstractRelNode.computeDigest(AbstractRelNode.java:416)
        at org.apache.calcite.rel.AbstractRelNode.recomputeDigest(AbstractRelNode.java:352)
        at org.apache.calcite.plan.hep.HepPlanner.buildFinalPlan(HepPlanner.java:881)
        at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:199)

In the 1.x branch, this would've failed for completely different reasons - the OR expressions
were left-leaning, so the expression a or b or c or d would be parsed as (((a or b) or c)
or d) - instead of being a balanced tree like (a or b) or (c or d).

The difference was between an expression tree of depth log2(N) and one of depth N, for N OR'd terms.
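
To make the depth difference concrete, here is a minimal Python sketch. It is not Hive's or Calcite's code: the Or class and the helper names are hypothetical, invented for illustration. It builds both tree shapes from the same list of terms and measures how deep each one gets.

```python
from dataclasses import dataclass


@dataclass
class Or:
    """Hypothetical binary OR node; leaves are plain strings."""
    left: object
    right: object


def left_leaning(terms):
    # (((a or b) or c) or d): each new term wraps the whole previous tree.
    expr = terms[0]
    for term in terms[1:]:
        expr = Or(expr, term)
    return expr


def balanced(terms):
    # (a or b) or (c or d): split the list in half at every level.
    if len(terms) == 1:
        return terms[0]
    mid = len(terms) // 2
    return Or(balanced(terms[:mid]), balanced(terms[mid:]))


def depth(expr):
    # Iterative walk, so a very deep tree cannot hit Python's recursion limit.
    deepest = 0
    stack = [(expr, 0)]
    while stack:
        node, d = stack.pop()
        if isinstance(node, Or):
            stack.append((node.left, d + 1))
            stack.append((node.right, d + 1))
        else:
            deepest = max(deepest, d)
    return deepest


terms = [f"c{i}" for i in range(1024)]
print(depth(left_leaning(terms)))  # 1023
print(depth(balanced(terms)))      # 10
```

Any code that walks the expression recursively needs one stack frame per level, which is why the left-leaning shape hurts long before the balanced one does.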

>  I didn't bother though it would be good information since I found a work around and
troubleshooting beeline wasn't my primary goal :)

The CBO error is likely to show up anyway - this might be a scenario where your HiveServer2
has been started up with a 2 GB heap and keeps dying while logging the query plans.

Loading the points off HDFS is pretty much the ideal solution, particularly if you pre-compute
an ST_Envelope for the small table side while loading (like an R-Tree) to reduce the total
number of actual intersection checks for complex polygons.
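
As a rough illustration of why a pre-computed envelope helps, here is a minimal Python sketch of a bounding-box pre-filter. It is not the ST_Envelope implementation and not a real R-Tree (it still scans every envelope linearly). All function names are hypothetical, and the exact check is a plain ray-casting point-in-polygon test chosen for brevity.

```python
def envelope(polygon):
    # Axis-aligned bounding box of a polygon given as [(x, y), ...].
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


def in_envelope(point, env):
    # Cheap test: four comparisons, independent of polygon complexity.
    x, y = point
    min_x, min_y, max_x, max_y = env
    return min_x <= x <= max_x and min_y <= y <= max_y


def in_polygon(point, polygon):
    # Expensive test: ray casting, cost grows with the number of vertices.
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def match_points(points, polygons):
    # Envelopes are computed once, up front, for the small polygon side.
    envelopes = [envelope(p) for p in polygons]
    exact_checks = 0
    hits = []
    for pt in points:
        for idx, (poly, env) in enumerate(zip(polygons, envelopes)):
            if not in_envelope(pt, env):
                continue  # skip the expensive check entirely
            exact_checks += 1
            if in_polygon(pt, poly):
                hits.append((pt, idx))
    return hits, exact_checks


polygons = [
    [(0, 0), (4, 0), (0, 4)],                # triangle
    [(10, 10), (12, 10), (12, 12), (10, 12)],  # square
]
points = [(1, 1), (3, 3), (11, 11), (20, 20)]
hits, exact_checks = match_points(points, polygons)
print(hits)          # [((1, 1), 0), ((11, 11), 1)]
print(exact_checks)  # 3
```

Here only 3 of the 8 point/polygon pairs reach the exact test. The point (3, 3) shows the other half of the idea: it passes the envelope check but fails the exact one, so the envelope only prunes candidates and never replaces the real intersection check.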

Cheers,
Gopal






