hive-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: Adding a temporary function for thrift queries
Date Mon, 28 Mar 2011 14:57:49 GMT
On Mon, Mar 28, 2011 at 10:53 AM, Matthew Rathbone
<matthew@foursquare.com> wrote:
> Hey guys,
> I could really do with some expert Hive help on my issue; my Hive expertise
> is not all that great.
> I'm using hive 0.7 with hadoop 0.20
> A simple way to describe my problem is this:
> Using thrift, if you execute the following sequence:
> thrift.execute("ADD JAR /udf.jar");
> thrift.execute("create temporary function function1 as
> 'org.apache.test.Function' ")
> then the second execute doesn't see the jar.
> But if I try to string them together:
> thrift.execute("ADD JAR /udf.jar ; create temporary function function1 as
> 'org.apache.test.Function1' ")
> then hive throws errors:
> 11/03/28 14:51:07 INFO SessionState: Added resource:
> /mnt/var/lib/hive_07/downloaded_resources/udf.jar
> ; does not exist
> 11/03/28 14:51:07 ERROR SessionState: ; does not exist
> create does not exist
> 11/03/28 14:51:07 ERROR SessionState: create does not exist
> temporary does not exist
> 11/03/28 14:51:07 ERROR SessionState: temporary does not exist
> function does not exist
> 11/03/28 14:51:07 ERROR SessionState: function does not exist
>
>
> Does anyone have a suggestion on how to string these together (along with a
> select statement afterwards)?
> Thanks for any help,
> Matthew
>
>
> On Thu, Mar 24, 2011 at 4:36 PM, Matthew Rathbone <matthew@foursquare.com>
> wrote:
>>
>> Hey all,
>> We use Amazon's elastic mapreduce and Hive 0.7 to run analytics queries,
>> and I'm having problems dynamically adding functions for use in the thrift
>> server.
>> I want to add a jar, add a function, then execute a query.
>> Using ruby as the example, I've tried:
>>       Hive.connect(@url, @port) do |connection|
>>         connection.execute(<ADD JAR and FUNCTION>)
>>         results = connection.fetch(query)
>>       end
>> but the function is not available between calls.
>> So I tried prepending the query with the function creation calls, but then
>> I don't get any data back from hive (simply an empty array).
>> Could someone direct me to the best way to add functions for thrift
>> queries? Honestly I'd rather add them permanently on startup, but I can't
>> find a way to do that either.
>
>
> --
> Matthew Rathbone
> Foursquare | Software Engineer | Server Engineering Team
> matthew@foursquare.com | @rathboma | 4sq
>

Traditionally, 'add jar' looks for the jar file on the Thrift server's
local file system, not the client's. I believe there is a 0.7.0 patch
to load UDF jars from HDFS, so this might help.
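
On the combined-statement attempt: the quoted log suggests that everything after ADD JAR is treated as a whitespace-separated list of resources (";", "create", "temporary" and "function" are each reported as "does not exist"), so the statements probably need to go out as separate execute calls. Below is a minimal Ruby sketch of that splitting step only. The names split_statements, execute_all and RecordingConnection are made up for illustration and are not part of any Hive client; the stand-in connection just records what it is sent. The split is naive and assumes no ';' appears inside a quoted string.

```ruby
# Split a script into individual statements so that each one can be
# sent in its own execute call.
# Assumption: no ';' appears inside a string literal.
def split_statements(script)
  script.split(';').map(&:strip).reject(&:empty?)
end

# Hypothetical stand-in for a Thrift connection. It only records the
# statements it receives, so the sketch runs without a Hive server.
class RecordingConnection
  attr_reader :statements

  def initialize
    @statements = []
  end

  def execute(statement)
    @statements << statement
  end
end

# Send each statement separately, in order, over the same connection.
def execute_all(connection, script)
  split_statements(script).each { |stmt| connection.execute(stmt) }
end

script = "ADD JAR /udf.jar ; create temporary function function1 as " \
         "'org.apache.test.Function1'"
conn = RecordingConnection.new
execute_all(conn, script)
conn.statements.each { |stmt| puts stmt }
```

With a real client, the connection object would replace RecordingConnection. This does not by itself make the jar visible to the server: the path given to ADD JAR still has to exist where the Thrift server looks for it, and the statements have to run in the same session.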
