hadoop-user mailing list archives

From jamal sasha <jamalsha...@gmail.com>
Subject executing hadoop commands from python?
Date Sat, 16 Feb 2013 22:47:44 GMT
Hi,

  This might be more of a Python-centric question, but I was wondering if
anyone has tried it out.

I am trying to run a few hadoop commands from a Python program.

For example, from the command line you can run:

      bin/hadoop dfs -ls /hdfs/query/path

It returns all the files in the HDFS query path, very much like ls on Unix.


Now I am trying to do the same thing from Python and then manipulate the
result.

     import os
     exec_str = "path/to/hadoop/bin/hadoop dfs -ls " + query_path
     # os.system returns only the exit status; the listing goes to stdout
     os.system(exec_str)

Now I want to capture this output so I can manipulate it, for example to
count the number of files.
I looked into the subprocess module, but these are not native shell
commands, so I am not sure whether those concepts apply.
How can I solve this?

Thanks
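
One possible approach, offered as a sketch rather than a definitive answer:
the subprocess module can launch any executable, not only shell built-ins,
so it should be able to run the hadoop launcher and capture its output.
The names HADOOP_BIN, hdfs_ls and count_files below are hypothetical, and
the counting step assumes that each entry line in the listing starts with a
permission string ("-" for a regular file, "d" for a directory), preceded
by a "Found N items" header line.

```python
import subprocess

# Hypothetical location of the hadoop launcher; adjust to your installation.
HADOOP_BIN = "path/to/hadoop/bin/hadoop"


def hdfs_ls(query_path, hadoop_bin=HADOOP_BIN):
    """Run `hadoop dfs -ls <query_path>` and return its stdout as text."""
    # Passing the arguments as a list avoids shell quoting problems.
    return subprocess.check_output(
        [hadoop_bin, "dfs", "-ls", query_path],
        universal_newlines=True,  # decode the output bytes to str
    )


def count_files(ls_output):
    """Count regular-file entries in the listing text.

    Assumption: file lines start with "-", directory lines start with "d",
    and the "Found N items" header starts with neither.
    """
    return sum(1 for line in ls_output.splitlines() if line.startswith("-"))
```

With these definitions, count_files(hdfs_ls(query_path)) would give the
number of files under query_path. check_output raises
subprocess.CalledProcessError if the command exits with a non-zero status,
so a missing path surfaces as an exception instead of an empty count.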
