hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jason Dere <jd...@hortonworks.com>
Subject Re: CREATE FUNCTION: How to automatically load extra jar file?
Date Fri, 02 Jan 2015 20:12:38 GMT
The point of USING JAR as part of the CREATE FUNCTION statement to try to avoid having to do
ADD JAR/aux path stuff to get the UDF to work. 

Are all of these commands (Step 1-5) from the same Hive CLI prompt?

>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using
JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
to class path
>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>> OK


One note, /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
here should actually be on the local file system, not on HDFS where you were checking in Step
5. During CREATE FUNCTION/query compilation, Hive will make a copy of the source JAR (hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar),
copied to a temp location on the local file system where it's used by that Hive session.

The location mentioned in the FileNotFoundException (hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar)
has a different path than the local copy mentioned during CREATE FUNCTION (/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar).
I'm not really sure why it is a HDFS path here either, but I'm not too familiar with what
goes on during the job submission process. But the fact that this HDFS path has the same naming
convention as the directory used for downloading resources locally (***_resources) looks a
little fishy to me. Would you be able to check if such a file exists with the same path, on
the local file system?





On Dec 31, 2014, at 5:22 AM, Nirmal Kumar <nirmal.kumar@impetus.co.in> wrote:

>   Important: HiveQL's ADD JAR operation does not work with HiveServer2 and the Beeline
client when Beeline runs on a different host. As an alterntive to ADD JAR, Hive auxiliary
path functionality should be used as described below.
> 
> Refer:
> http://www.cloudera.com/content/cloudera/en/documentation/cloudera-manager/v4-8-0/Cloudera-Manager-Managing-Clusters/cmmc_hive_udf.html​
> 
> 
> Thanks,
> -Nirmal
> 
> From: Arthur.hk.chan@gmail.com <arthur.hk.chan@gmail.com>
> Sent: Tuesday, December 30, 2014 9:54 PM
> To: vic0777
> Cc: Arthur.hk.chan@gmail.com; user@hive.apache.org
> Subject: Re: CREATE FUNCTION: How to automatically load extra jar file?
>  
> Thank you.
> 
> Will this work for hiveserver2 ?
> 
> 
> Arthur
> 
> On 30 Dec, 2014, at 2:24 pm, vic0777 <vic0777@163.com> wrote:
> 
>> 
>> You can put it into $HOME/.hiverc like this: ADD JAR full_path_of_the_jar. Then,
the file is automatically loaded when Hive is started.
>> 
>> Wantao
>> 
>> 
>> 
>> 
>> At 2014-12-30 11:01:06, "Arthur.hk.chan@gmail.com" <arthur.hk.chan@gmail.com>
wrote:
>> Hi,
>> 
>> I am using Hive 0.13.1 on Hadoop 2.4.1, I need to automatically load an extra JAR
file to hive for UDF, below are my steps to create the UDF function. I have tried the following
but still no luck to get thru.
>> 
>> Please help!!
>> 
>> Regards
>> Arthur
>> 
>> 
>> Step 1:   (make sure the jar in in HDFS)
>> hive> dfs -ls hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar;
>> -rw-r--r--   3 hadoop hadoop      57388 2014-12-30 10:02hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>> 
>> Step 2: (drop if function exists) 
>> hive> drop function sysdate;                                                 

>> OK
>> Time taken: 0.013 seconds
>> 
>> Step 3: (create function using the jar in HDFS)
>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using
JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
to class path
>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>> OK
>> Time taken: 0.034 seconds
>> 
>> Step 4: (test)
>> hive> select sysdate();                                                      
                                                                         
>> Automatically selecting local only mode for query
>> Total jobs = 1
>> Launching Job 1 out of 1
>> Number of reduce tasks is set to 0 since there's no reduce operator
>> SLF4J: Class path contains multiple SLF4J bindings.
>> SLF4J: Found binding in [jar:file:/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: Found binding in [jar:file:/hadoop/hbase-0.98.5-hadoop2/lib/phoenix-4.1.0-client-hadoop2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an
attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an
attempt to override final parameter: yarn.nodemanager.loacl-dirs;  Ignoring.
>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an
attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
>> Execution log at: /tmp/hadoop/hadoop_20141230101717_282ec475-8621-40fa-8178-a7927d81540b.log
>> java.io.FileNotFoundException: File does not exist:hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>> at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
>> at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
>> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>> at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
>> at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
>> at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
>> at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:415)
>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
>> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
>> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:415)
>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
>> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
>> at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
>> at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:740)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:606)
>> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>> Job Submission failed with exception 'java.io.FileNotFoundException(File does not
exist:hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar)'
>> Execution failed with exit status: 1
>> Obtaining error information
>> Task failed!
>> Task ID:
>>   Stage-1
>> Logs:
>> /tmp/hadoop/hive.log
>> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>> 
>> 
>> Step 5: (check the file)
>> hive> dfs -ls /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar;
>> ls: `/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar':
No such file or directory
>> Command failed with exit code = 1
>> Query returned non-zero code: 1, cause: null
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
> 
> 
> 
> 
> 
> 
> 
> 
> NOTE: This message may contain information that is confidential, proprietary, privileged
or otherwise protected by law. The message is intended solely for the named addressee. If
received in error, please destroy and notify the sender. Any use of this email is prohibited
when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity
of this communication has been maintained nor that the communication is free of errors, virus,
interception or interference.


-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Mime
View raw message