hive-user mailing list archives

From "Markovitz, Dudu" <>
Subject RE: Beeline throws OOM on large input query
Date Sat, 03 Sep 2016 10:04:05 GMT
Hi Adam

I’m not clear about what you are trying to achieve in your query.
Can you please give a small example?



From: Adam []
Sent: Friday, September 02, 2016 4:13 PM
Subject: Re: Beeline throws OOM on large input query

I set the heap size using HADOOP_CLIENT_OPTS all the way to 16g and still no luck.

I tried going the table-join route, but the relation is not an equality, so it would be a
theta join, which Hive does not support.
Basically, I am doing a geographic intersection against 6,000 points, so the WHERE clause
has 6,000 points in it (I use a custom UDF for the intersection).

To avoid the problem I ended up writing another version of the UDF that reads the point list
from an HDFS file.
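
A rough sketch of that workaround in plain Java, for illustration only: load the point list once, cache it, and test each row against the cached list. The class and method names (PointListCache, parse, load, nearAny) are hypothetical, java.nio.file stands in for the HDFS read, and a simple distance check stands in for the real intersection test. The actual UDF is not shown in this thread.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: reads "lon,lat" lines once and reuses them for every row,
// so the query text no longer has to carry thousands of literal points.
final class PointListCache {
    private static List<double[]> points; // loaded lazily, once per JVM

    // Parses lines of the form "lon,lat"; blank lines are skipped.
    static List<double[]> parse(List<String> lines) {
        List<double[]> out = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] parts = trimmed.split(",");
            out.add(new double[] {
                Double.parseDouble(parts[0].trim()),
                Double.parseDouble(parts[1].trim())
            });
        }
        return out;
    }

    // Assumption: a local file stands in for the HDFS file mentioned above.
    static synchronized List<double[]> load(Path file) throws IOException {
        if (points == null) {
            points = parse(Files.readAllLines(file, StandardCharsets.UTF_8));
        }
        return points;
    }

    // Placeholder for the real geographic intersection: true if (lon, lat) lies
    // within `radius` (plain Euclidean distance in degrees) of any cached point.
    static boolean nearAny(List<double[]> pts, double lon, double lat, double radius) {
        for (double[] p : pts) {
            double dx = p[0] - lon;
            double dy = p[1] - lat;
            if (dx * dx + dy * dy <= radius * radius) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("points", ".csv");
        Files.write(tmp, Arrays.asList("1.0,2.0", "3.5,4.5"), StandardCharsets.UTF_8);
        List<double[]> pts = load(tmp);
        System.out.println("loaded " + pts.size() + " points");
        System.out.println("match: " + nearAny(pts, 1.0, 2.05, 0.1));
        Files.delete(tmp);
    }
}
```

With this shape, the query only passes the file location (or nothing at all) to the UDF, and the query string stays small regardless of how many points are in the list.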

It's low priority, I'm sure, but I bet there are some inefficiencies in the query string
handling that could be fixed. When I traced the code, it was doing all kinds of StringBuffer
and String += type stuff.
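
To illustrate why that pattern can hurt on a very large query string, here is a small standalone comparison. It is not Beeline's code, and the class and method names are made up for the example. Repeated String += in a loop copies everything built so far on each step, so the total work grows roughly quadratically, while a single StringBuilder appends in amortized linear time. (StringBuffer on its own is not quadratic; it mainly adds synchronization overhead.)

```java
// Illustration only: compares String += against StringBuilder when assembling
// a long comma-separated list, similar in shape to a WHERE clause with many values.
final class QueryStringBuild {

    // Each += allocates a new String and copies the whole prefix again.
    static String concatNaive(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += "p" + i + ",";
        }
        return s;
    }

    // One growing buffer; each append is amortized constant time.
    static String concatBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append('p').append(i).append(',');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int n = 20000;

        long t0 = System.nanoTime();
        String a = concatNaive(n);
        long t1 = System.nanoTime();
        String b = concatBuilder(n);
        long t2 = System.nanoTime();

        System.out.println("same result: " + a.equals(b));
        System.out.println("String +=     ms: " + (t1 - t0) / 1_000_000);
        System.out.println("StringBuilder ms: " + (t2 - t1) / 1_000_000);
    }
}
```

The timings vary by JVM and machine, so treat them as a rough indication only. The intermediate copies made by += also create short-lived garbage, which may add heap pressure on top of the time cost.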
