hive-user mailing list archives

From Edward Capriolo <>
Subject Re: GenericUdf and Jdbc issues
Date Tue, 29 May 2012 13:57:54 GMT
this.getResourceAsStream(filename) is a very tricky method to get
right, especially in Hive, where you have the Hive classpath, the
Hadoop classpath, and the Hive JDBC classpath. It gets even trickier
when you consider that launched map/reduce tasks get their own
environment and classpath.
I had the same issues when I was writing my geo-ip-udf. See the comments.

I came to the conclusion that if you add a file to the distributed
cache using 'ADD FILE', you can reliably assume it will be in the
task's current working directory, so this works:

        // relative path resolves against the task's working directory
        File f = new File(database);

I hope this helps.
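To make that concrete, here is a minimal, self-contained sketch of the lookup order described above: try the working directory first (where 'ADD FILE' places the side file), and only fall back to the classpath, which may come up empty under JDBC. The class name and the leading-slash classpath lookup are my own illustrative choices, not from the original UDF:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ResourceLocator {

    // Open a side file the way the advice above suggests: assume
    // 'ADD FILE' dropped it into the current working directory, and
    // only fall back to the classpath (unreliable across the CLI,
    // JDBC, and map/reduce environments) if it is not there.
    public static InputStream open(String filename) throws IOException {
        File f = new File(filename); // relative => current working dir
        if (f.exists()) {
            return new FileInputStream(f);
        }
        // Fallback: may return null when the classpaths differ,
        // which is exactly the JDBC failure described in this thread.
        return ResourceLocator.class.getResourceAsStream("/" + filename);
    }
}
```

With this order, a UDF that works under the CLI keeps working under JDBC as long as the file was shipped with 'ADD FILE', because the working-directory check never touches the classpath.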

On Tue, May 29, 2012 at 8:35 AM, Maoz Gelbart <> wrote:
> Hi all,
> I am using Hive 0.7.1 over Cloudera’s Hadoop distribution 0.20.2 and MapR
> hdfs distribution 1.1.1.
> I wrote a GenericUDF, packaged as a jar, that attempts to open a local
> resource during its initialization in initialize(ObjectInspector[]
> arguments).
> When I run with the CLI, everything is fine.
> When I run using Cloudera’s Hive JDBC driver, the UDF fails with a null
> pointer returned from the call this.getResourceAsStream(filename).
> Removing that line fixed the problem and the UDF ran on both CLI and JDBC,
> so I believe that “ADD JAR” and “CREATE TEMPORARY FUNCTION” were entered
> correctly.
> Did anyone observe such behavior? I have a demo jar to reproduce the
> problem if needed.
> Thanks,
> Maoz
