hive-user mailing list archives

From Matthew Rathbone <matt...@foursquare.com>
Subject Re: Adding a temporary function for thrift queries
Date Mon, 28 Mar 2011 15:17:35 GMT
Hey, thanks for the response.

I have the jar on the Thrift server's local file system (it's the same
machine that runs Hive), and it's this path that I pass to the ADD JAR
command.
If I tail the logs I can see that the ADD JAR command is successful (when
loading from local fs), but the subsequent execution of the create function
statement still doesn't see the class:

Added /mnt/var/lib/hive_07/downloaded_resources/udf.jar to class path
11/03/28 15:14:10 INFO exec.FunctionTask: create function:
java.lang.ClassNotFoundException: com.example.udf.Function1

Do you know if the session state gets reset between execute calls?
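
For reference, the semicolon-joined form from my earlier message fails
because everything after ADD JAR appears to be read as a list of resource
paths (hence "; does not exist", "create does not exist", and so on), so each
statement has to go out in its own execute call. Below is a minimal Ruby
sketch of that splitting step only. FakeConnection is a hypothetical stand-in
for the real Thrift client, and split_statements / execute_all are names made
up for illustration. It does not address the ClassNotFoundException above.

```ruby
# Split a semicolon-separated script into individual statements.
# Naive on purpose: it does not handle semicolons inside quoted strings.
def split_statements(script)
  script.split(";").map(&:strip).reject(&:empty?)
end

# Hypothetical stand-in for a Thrift connection (assumption, not the real
# client). It only records the statements it is asked to execute.
class FakeConnection
  attr_reader :sent

  def initialize
    @sent = []
  end

  def execute(statement)
    @sent << statement
  end
end

# Send each statement in its own execute call on the same connection.
def execute_all(connection, script)
  split_statements(script).each { |stmt| connection.execute(stmt) }
end

script = "ADD JAR /udf.jar ; create temporary function function1 as " \
         "'org.apache.test.Function1'"
conn = FakeConnection.new
execute_all(conn, script)
puts conn.sent
```

With a real client, the same execute_all call would run inside the
Hive.connect block so that every statement shares one connection.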

On Mon, Mar 28, 2011 at 10:57 AM, Edward Capriolo <edlinuxguru@gmail.com> wrote:

> On Mon, Mar 28, 2011 at 10:53 AM, Matthew Rathbone
> <matthew@foursquare.com> wrote:
> > Hey guys,
> > I could really do with some expert Hive help on my issue; my Hive
> > expertise is not all that great.
> > I'm using hive 0.7 with hadoop 0.20
> > A simple way to describe my problem is this:
> > Using thrift, if you execute the following sequence:
> > thrift.execute("ADD JAR /udf.jar");
> > thrift.execute("create temporary function function1 as 'org.apache.test.Function' ")
> > then the second execute doesn't see the jar.
> > But if I try to string them together:
> > thrift.execute("ADD JAR /udf.jar ; create temporary function function1 as 'org.apache.test.Function1' ")
> > then hive throws errors:
> > 11/03/28 14:51:07 INFO SessionState: Added resource:
> > /mnt/var/lib/hive_07/downloaded_resources/udf.jar
> > ; does not exist
> > 11/03/28 14:51:07 ERROR SessionState: ; does not exist
> > create does not exist
> > 11/03/28 14:51:07 ERROR SessionState: create does not exist
> > temporary does not exist
> > 11/03/28 14:51:07 ERROR SessionState: temporary does not exist
> > function does not exist
> > 11/03/28 14:51:07 ERROR SessionState: function does not exist
> >
> >
> > Does anyone have a suggestion on how to string these together (along with
> > a select statement afterwards)?
> > Thanks for any help,
> > Matthew
> >
> >
> > On Thu, Mar 24, 2011 at 4:36 PM, Matthew Rathbone <matthew@foursquare.com>
> > wrote:
> >>
> >> Hey all,
> >> We use Amazon's elastic mapreduce and Hive 0.7 to run analytics queries,
> >> and I'm having problems dynamically adding functions for use in the
> >> thrift server.
> >> I want to add a jar, add a function, then execute a query.
> >> Using ruby as the example, I've tried:
> >>       Hive.connect(@url, @port) do |connection|
> >>         connection.execute(<ADD JAR and FUNCTION>)
> >>         results = connection.fetch(query)
> >>       end
> >> but the function is not available between calls.
> >> So I tried prepending the query with the function creation calls, but
> >> then I don't get any data back from hive (simply an empty array).
> >> Could someone direct me to the best way to add functions for thrift
> >> queries? Honestly I'd rather add them permanently on startup, but I
> >> can't find a way to do that either.
> >
> >
> > --
> > Matthew Rathbone
> > Foursquare | Software Engineer | Server Engineering Team
> > matthew@foursquare.com | @rathboma | 4sq
> >
>
> Traditionally 'add jar' would look for the jar file on the thrift
> server's local file system, not the client's. I believe there is a
> 0.7.0 patch to load UDF jars from HDFS, so this might help.
>



-- 
Matthew Rathbone
Foursquare | Software Engineer | Server Engineering Team
matthew@foursquare.com | @rathboma <http://twitter.com/rathboma> |
4sq<http://foursquare.com/rathboma>
