hive-user mailing list archives

From rakesh sharma <rakeshsharm...@hotmail.com>
Subject RE: Running python UDF in hive
Date Fri, 21 Aug 2015 13:16:31 GMT



Hi Friends,
Thanks, it works now. I don't think we need to wrap the logic in any functions, as we are
reading directly from stdin, as shown in the examples over the ne

From: Ryan.Harris@zionsbancorp.com
To: user@hive.apache.org
Subject: RE: Running python UDF in hive
Date: Thu, 20 Aug 2015 19:24:24 +0000

Remember that transform scripts in Hive should read their input from STDIN and write their
results to STDOUT. So, to properly test your transform script, try this:
hive -e "select id from test limit 10" > testout.txt
cat testout.txt | python transform_value.py

If your transform script is working correctly, it will produce 10 lines of output.

If the above test works but your query still fails, check that the datanodes are properly
configured to run your Python script.
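
For illustration, here is a minimal sketch of what a STDIN-to-STDOUT transform script along
these lines might look like. The actual transform_value.py was not posted in this thread, so
the function name and the NULL handling below are assumptions. It relies on Hive passing rows
to the script as tab-separated strings, with NULL written as \N.

```python
import sys


def transform(lines, out):
    """Increment the first tab-separated column of every input row."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        value = fields[0]
        if value == "\\N" or value == "":
            # Assumption: pass NULL (serialized by Hive as \N) and empty values through as NULL.
            out.write("\\N\n")
            continue
        # Each output line becomes one row in the query result.
        out.write(str(int(value) + 1) + "\n")


if __name__ == "__main__":
    # Hive feeds rows on STDIN and collects results from STDOUT.
    transform(sys.stdin, sys.stdout)
```

Keeping the loop in a function that takes the input and output streams as arguments makes it
easy to exercise the logic without a cluster.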
 


From: Manjee, Sunile [mailto:Sunile.Manjee@Teradata.com]
Sent: Thursday, August 20, 2015 8:28 AM
To: user@hive.apache.org
Subject: Re: Running python UDF in hive

Did you test your Python script standalone to verify that it works as expected?
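
One possible way to run such a standalone check from Python is sketched below. It pipes a few
rows into a script and reports the exit status and anything written to STDERR, since a script
that exits with a non-zero status is a common reason for Hive to report Error 20003. The helper
name run_script and the stand-in demo script are hypothetical; the script from this thread was
not posted.

```python
import subprocess
import sys
import tempfile
import textwrap


def run_script(script_path, rows):
    """Pipe rows to the script as a shell pipeline would; return (exit code, stdout, stderr)."""
    proc = subprocess.run(
        [sys.executable, script_path],
        input="".join(row + "\n" for row in rows),
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout, proc.stderr


# Stand-in script for the demo only (an assumption, not the poster's script).
DEMO_SCRIPT = textwrap.dedent("""\
    import sys
    for line in sys.stdin:
        print(int(line.strip()) + 1)
""")

with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
    handle.write(DEMO_SCRIPT)
    demo_path = handle.name

code, out, err = run_script(demo_path, ["1", "2", "3"])
print("exit code:", code)
print("stdout lines:", out.splitlines())
print("stderr:", err.strip())
```

A non-zero exit code or a traceback on STDERR here points at the script itself rather than at
the cluster configuration.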


 


From: rakesh sharma <rakeshsharma14@hotmail.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Thursday, August 20, 2015 at 7:55 AM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Running python UDF in hive

Hi all,

I am running a Python UDF in Hive and I am getting the following error.

hive> select transform(id) using 'python transform_value.py' as (id string) from test;
Query ID = 19659_20150820175050_ccb3b5e2-7e45-44a6-b16f-c4a4ad59e8f2
Total jobs = 1
Launching Job 1 out of 1

Status: Running (Executing on YARN cluster with App id application_1435674747354_0320)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1                 FAILED      1          0        0        1       4       0
--------------------------------------------------------------------------------
VERTICES: 00/01  [>>--------------------------] 0%    ELAPSED TIME: 11.06 s
--------------------------------------------------------------------------------
Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_1435674747354_0320_1_00, diagnostics=[Task failed, taskId=task_1435674747354_0320_1_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators
        at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:333)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:177)
        ... 13 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20003]: An error occurred when trying to close the Operator running your custom script.
        at org.apache.hadoop.hive.ql.exec.ScriptOperator.close(ScriptOperator.java:550)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
        at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:309)
        ... 14 more
], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators
        at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:333)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:177)
        ... 13 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20003]: An error occurred when trying to close the Operator running your custom script.
        at org.apache.hadoop.hive.ql.exec.ScriptOperator.close(ScriptOperator.java:550)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
        at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:309)
        ... 14 more
], TaskAttempt 2 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators
        at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:333)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:177)
        ... 13 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20003]: An error occurred when trying to close the Operator running your custom script.
        at org.apache.hadoop.hive.ql.exec.ScriptOperator.close(ScriptOperator.java:550)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
        at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:309)
        ... 14 more
], TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators
        at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:333)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:177)
        ... 13 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20003]: An error occurred when trying to close the Operator running your custom script.
        at org.apache.hadoop.hive.ql.exec.ScriptOperator.close(ScriptOperator.java:550)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
        at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:309)
        ... 14 more
]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex vertex_1435674747354_0320_1_00 [Map 1] killed/failed due to:null]
DAG failed due to vertex failure. failedVertices:1 killedVertices:0
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask
hive> [Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators
    >

I am writing a simple example with a table that has one column of int type, and the Python
script increments it.

Can someone point out what type of error this is so that I can go and debug it? I am clueless.





