hive-dev mailing list archives

From Roberto Congiu <>
Subject Re: about User scripte in HiveQL
Date Tue, 01 Mar 2011 02:24:37 GMT
You have to add the file to the query with add FILE, as in the example
below.

Look at the part in red.

CREATE TABLE u_data_new (
  userid INT,
  movieid INT,
  rating INT,
  weekday INT);

add FILE;

INSERT OVERWRITE TABLE u_data_new
SELECT
  TRANSFORM (userid, movieid, rating, unixtime)
  USING 'python'
  AS (userid, movieid, rating, weekday)
FROM u_data;

SELECT weekday, COUNT(*)
FROM u_data_new
GROUP BY weekday;
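
For context, a script used with TRANSFORM receives each input row on stdin as one line of tab-separated column values and writes its output rows to stdout in the same form. The script behind the example above is not shown in this message, so the following is only a minimal sketch of what such a script might look like, assuming Python 3. The function names map_line and main are hypothetical, and reading unixtime as UTC is an assumption of the sketch.

```python
#!/usr/bin/env python3
# Hypothetical TRANSFORM script: replaces the unixtime column with a weekday.
import sys
from datetime import datetime, timezone


def map_line(line):
    # Hive sends each row as tab-separated values terminated by a newline.
    userid, movieid, rating, unixtime = line.rstrip("\n").split("\t")
    # Assumption: the timestamp is interpreted as UTC. Monday is 1, Sunday is 7.
    weekday = datetime.fromtimestamp(float(unixtime), tz=timezone.utc).isoweekday()
    return "\t".join([userid, movieid, rating, str(weekday)])


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Emit one output row per non-empty input row.
    for line in stdin:
        if line.strip():
            stdout.write(map_line(line) + "\n")


if __name__ == "__main__":
    main()
```

Once a script has been registered with add FILE, Hive ships it to the nodes that run the tasks, so the USING clause can refer to it by file name instead of an absolute path that exists on one machine only.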

2011/2/28 Jianhua Wang <>

> Hi all,
>      Recently I ran into a problem that I could not solve after some
> effort, so I am looking for help here. Any help will be appreciated.
> Thanks!
>      My case is as follows.
>      I want to execute the HiveQL command:
> select transform(a.col) using '/home/pc/' as (col string) from
> tmp_table a where a.col2='01';
> where '' is a Python script of mine.
> I have set up a Hadoop environment in a VMware machine on my single
> node PC-home, and the command works well in this single-node
> environment.
> I also have a cluster of three PC servers, including node A, B, and C.
> I stored '/home/pc/' on node A.
> However, every time I issue the command to the cluster, I get an error
> like this:
> -------------------------------------------------------------------------------------------------------------------
> Caused by: Cannot run program "/home/pc/":
> error=2, No such file or directory
>         at java.lang.ProcessBuilder.start(
>         at
> org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(
>         ... 20 more
>    Caused by: error=2, No such
> file or directory
>         at java.lang.UNIXProcess.(
>         at java.lang.ProcessImpl.start(
>         at java.lang.ProcessBuilder.start(
>         ... 21 more
>  -------------------------------------------------------------------------------------------------------------------
> According to the job logs, these errors were reported by node B and node
> C. It seems that the tasktrackers on B and C cannot find the script.
> On the Hive wiki, I didn't find any instructions on where to place the
> user script.
> Where should I put my script so that it can be found?
> Thanks in advance for any reply!
> 2011-03-01
> Jianhua Wang

Roberto Congiu -Data Engineer - OpenX
20 E Del Mar blvd, Pasadena, CA
