hadoop-user mailing list archives

From Fan Bai <fb...@student.gsu.edu>
Subject [No Subject]
Date Sun, 24 Mar 2013 22:23:58 GMT

Dear Sir,

I have a question about Hadoop. When I use Hadoop MapReduce to run a job (only one job here), can I control which node each input file is processed on?

For example, I have one job with 10 input files (so 10 mappers need to run). My cluster has one head node and four worker nodes. My question is: can I assign each of those 10 files to a specific node? For instance: file 1 runs on node1, file 3 on node2, file 5 on node3, and file 8 on node4.

If I can do this, it means I can control task placement. Does it also mean I can control the placement in the next round? (I have a loop on the head node, so I can launch another MapReduce job.) For example, could I have file 5 processed on node3 in the first round, and then on node2 in the second round?

If I cannot, does that mean that, for Hadoop, the node a file is processed on is a "black box"? That is, the user cannot control which node a file runs on, because the framework assumes the user does not need that control and simply lets HDFS handle the parallel work.
In that case, Hadoop would not let me control task placement within a single job, although I could still control scheduling across multiple jobs.
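To make the question concrete: as I understand it, Hadoop exposes preferred hosts per input split (e.g. via `InputSplit.getLocations()`), and the scheduler treats these only as locality *hints*, not bindings. The toy scheduler below is my own illustrative sketch (none of these class or method names are Hadoop APIs): each file names a preferred node, the task runs there if that node has a free slot, and otherwise it falls back to any free node. Is this roughly the behavior I should expect?

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch, NOT Hadoop code: models how a scheduler might treat
// per-file node preferences as hints. A file runs on its preferred node only
// if that node has a free slot; otherwise it falls back to any free node.
public class LocalityHintDemo {
    public static Map<String, String> schedule(Map<String, String> preferred,
                                               Map<String, Integer> slots) {
        Map<String, String> assignment = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : preferred.entrySet()) {
            String file = e.getKey();
            String want = e.getValue();
            String chosen = null;
            if (slots.getOrDefault(want, 0) > 0) {
                chosen = want;                      // locality hint honored
            } else {
                for (Map.Entry<String, Integer> n : slots.entrySet()) {
                    if (n.getValue() > 0) {         // fall back to any free node
                        chosen = n.getKey();
                        break;
                    }
                }
            }
            if (chosen != null) {
                slots.put(chosen, slots.get(chosen) - 1);
                assignment.put(file, chosen);
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Four worker nodes, one map slot each (hypothetical cluster).
        Map<String, Integer> slots = new LinkedHashMap<>();
        slots.put("node1", 1);
        slots.put("node2", 1);
        slots.put("node3", 1);
        slots.put("node4", 1);

        // The mapping from my example above; file8 also prefers node3,
        // so its hint cannot be honored once file5 takes node3's slot.
        Map<String, String> preferred = new LinkedHashMap<>();
        preferred.put("file1", "node1");
        preferred.put("file3", "node2");
        preferred.put("file5", "node3");
        preferred.put("file8", "node3");

        System.out.println(schedule(preferred, slots));
        // file1, file3, file5 land on their preferred nodes;
        // file8 falls back to the only remaining free node, node4.
    }
}
```

If this is accurate, then the most I can do within one job is bias placement, not force it.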

Thank you so much!

Fan Bai
PhD Candidate
Computer Science Department
Georgia State University
Atlanta, GA 30303
