hive-user mailing list archives

From Gopal Vijayaraghavan
Subject Re: Optimizing UDF
Date Wed, 15 Jul 2015 01:26:59 GMT

> I am already using Tez (sorry, forgot to mention this), and my goal is
>indeed to build the instance once per container.

Put a log line in your UDF init() and check if it is being called multiple
times per container. If you're loading the data every time, then that might
be something to fix.

The other aspect is that GC pauses can happen because of that reloading,
and extraneous causes like these can also account for the slow-down.
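
One way to check the GC side, as a rough sketch only: the standard JMX
beans report cumulative collection counts and times for the current JVM,
so logging them at the start and end of a task shows how much time went to
GC. The class name GcSnapshot is made up for this example and is not part
of Hive or Tez.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: sums GC activity across all collectors in this JVM.
public class GcSnapshot {

    // Total time spent in GC so far, in milliseconds.
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Total number of collections so far.
    public static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount(); // -1 when the collector does not report it
            if (c > 0) {
                total += c;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.err.println("GC so far: " + totalGcCount() + " collections, "
                + totalGcMillis() + " ms");
    }
}
```

If the numbers jump by a large amount between two log lines placed around
the load, that would suggest the reloading is what drives the pauses.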

But first, look at how many times you are loading the distributed cache
data per container.
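
A minimal sketch of how the counting could look, with the Hive parts left
out so it runs on its own. It keeps the loaded data in a static field and
logs every load with a counter. LookupCache and loadTable() are
hypothetical names, and loadTable() is only a stand-in for reading the
distributed cache file. It assumes the class stays loaded by the same
classloader for the life of the container JVM.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical holder: loads the table at most once per classloader and logs each load.
public class LookupCache {

    // Number of times the data has been loaded through this class.
    private static final AtomicInteger LOAD_COUNT = new AtomicInteger();

    // Shared across every UDF instance that uses this class.
    private static volatile Map<String, String> table;

    // Stand-in for the real load from the distributed cache (assumption).
    private static Map<String, String> loadTable() {
        int n = LOAD_COUNT.incrementAndGet();
        System.err.println("LookupCache: loading table, load #" + n + " in this JVM");
        Map<String, String> m = new HashMap<>();
        m.put("k", "v");
        return m;
    }

    // Call this from the UDF's init path; only the first caller pays for the load.
    public static Map<String, String> get() {
        Map<String, String> local = table;
        if (local == null) {
            synchronized (LookupCache.class) {
                local = table;
                if (local == null) {
                    local = loadTable();
                    table = local;
                }
            }
        }
        return local;
    }

    public static int loadCount() {
        return LOAD_COUNT.get();
    }

    public static void main(String[] args) {
        get();
        get();
        System.err.println("loads: " + loadCount());
    }
}
```

If the container log shows "load #2" or higher, the data is being reloaded
within a single JVM. If every log line says "load #1" but there are many
of them per container, the class itself is probably being reloaded, which
a static field alone would not fix.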

