hive-user mailing list archives

From Adam <work....@gmail.com>
Subject Re: Beeline throws OOM on large input query
Date Fri, 02 Sep 2016 13:12:31 GMT
I raised the client heap size through HADOOP_CLIENT_OPTS all the way up to 16g
and still had no luck.

I tried going down the table join route, but the problem is that the join
condition is not an equality, so it would be a theta join, which is not
supported in Hive.
Basically, what I am doing is a geographic intersection against 6,000 points,
so the WHERE clause has 6,000 points in it (I use a custom UDF for the
intersection).

To avoid the problem I ended up writing another version of the UDF that
reads the point list from an HDFS file.
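
For anyone who wants to try the same workaround, here is a rough sketch of just the loading step. It is not my actual UDF. The class name PointListLoader and the file layout (one "lon,lat" pair per line, with '#' comment lines) are assumptions made up for this illustration. It takes a plain java.io.Reader so it can be tried without a cluster; inside a UDF the Reader would presumably wrap a stream opened from HDFS.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: loads a point list once so the query text stays short.
final class PointListLoader {

    // Assumed format: one "lon,lat" pair per line.
    // Blank lines and lines starting with '#' are skipped.
    static List<double[]> load(Reader source) {
        List<double[]> points = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty() || line.startsWith("#")) {
                    continue;
                }
                String[] parts = line.split(",");
                if (parts.length != 2) {
                    throw new IllegalArgumentException("Malformed point line: " + line);
                }
                points.add(new double[] {
                    Double.parseDouble(parts[0].trim()),
                    Double.parseDouble(parts[1].trim())
                });
            }
        } catch (IOException e) {
            // Rethrow unchecked so callers do not need a throws clause.
            throw new UncheckedIOException(e);
        }
        return points;
    }
}
```

The idea is to load the list once per task and keep it in a field of the UDF, so the query itself only has to carry the file path instead of 6,000 literals.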

It's low priority, I'm sure, but I bet there are some inefficiencies in the
query string handling that could be fixed.  When I traced the code, it was
doing all kinds of StringBuffer and String += concatenation.
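
To illustrate the kind of pattern I mean, here is a generic sketch. It is not the actual Beeline or Hive code, and the class and method names are made up. Building a large string with += copies everything accumulated so far on each step, while a single pre-sized StringBuilder appends in place.

```java
import java.util.List;

// Hypothetical illustration of the two concatenation styles.
final class QueryStringBuilding {

    // Each += allocates a new String and copies all text built so far,
    // so the total work grows quadratically with the output length.
    static String joinWithPlusEquals(List<String> parts) {
        String out = "";
        for (String p : parts) {
            out += p;
        }
        return out;
    }

    // One buffer sized up front: each part is copied into the buffer once,
    // followed by a single copy in toString().
    static String joinWithBuilder(List<String> parts) {
        int size = 0;
        for (String p : parts) {
            size += p.length();
        }
        StringBuilder sb = new StringBuilder(size);
        for (String p : parts) {
            sb.append(p);
        }
        return sb.toString();
    }
}
```

Both methods return the same string; the difference is only in how many intermediate copies are made, which matters when the query text contains thousands of literals.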

Regards,
