hadoop-common-user mailing list archives

From "Martin Blom" <vaftrud...@gmail.com>
Subject Import path for hadoop streaming with python
Date Thu, 22 May 2008 23:39:43 GMT
Hello all,

I'm trying to stream a little python script on my small hadoop
cluster, and it doesn't work like I thought it would.

The script looks something like

#!/usr/bin/env python
import mylib

where mylib is a small python library that I want included, and I
launch the whole thing with something like

bin/hadoop jar contrib/streaming/hadoop-0.16.4-streaming.jar
-cacheFile "hdfs://master:54310/user/hadoop/mylib.py#mylib.py" -file
script.py -mapper "script.py" -input input -output output

so it seems to me like the library should be available to the script.
When I run the script locally on my machine everything works perfectly
fine. However, when I run it on the cluster, the script can't find the library.
Does hadoop do anything strange to default paths? Am I missing
something obvious? Any pointers or ideas on how to fix this would be
appreciated.

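One way to narrow this down is to have the script report where it is running and where Python will search before it attempts the import. Python puts the directory containing the script, not the current working directory, at the front of sys.path, so the two can differ when a script is launched from somewhere else. The sketch below assumes (unverified for this cluster) that the file named after the "#" in -cacheFile shows up in the task's working directory. The helper names describe_import_environment and ensure_cwd_on_path are made up for illustration and are not part of Hadoop.

```python
#!/usr/bin/env python
import os
import sys


def describe_import_environment(module_file="mylib.py"):
    """Return a text report of the working directory and the import path.

    Hypothetical debugging helper; not part of Hadoop or the original script.
    """
    cwd = os.getcwd()
    found = os.path.exists(os.path.join(cwd, module_file))
    lines = [
        "cwd: %s" % cwd,
        "cwd contents: %s" % sorted(os.listdir(cwd)),
        "%s in cwd: %s" % (module_file, found),
        "sys.path:",
    ]
    lines.extend("  %r" % entry for entry in sys.path)
    return "\n".join(lines)


def ensure_cwd_on_path():
    """Put the current working directory first on sys.path if it is missing."""
    cwd = os.getcwd()
    if cwd not in sys.path:
        sys.path.insert(0, cwd)
    return cwd


if __name__ == "__main__":
    # Report on stderr so the diagnostics do not mix with mapper output on stdout.
    sys.stderr.write(describe_import_environment() + "\n")
    # Assumption: mylib.py is reachable from the working directory.
    ensure_cwd_on_path()
    try:
        import mylib  # the library from the message above
    except ImportError as err:
        sys.stderr.write("import mylib failed: %s\n" % err)
```

If the report shows mylib.py in the working directory but that directory is absent from sys.path, the sys.path.insert call would be enough to make the import succeed. If the file is not listed at all, the problem lies in how the file is shipped to the task rather than in Python's search path.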
Martin Blom
