hadoop-common-dev mailing list archives

From arun kumar <arunkumar_sk...@yahoo.com>
Subject Re: Question on job scheduling
Date Wed, 24 Feb 2010 14:27:51 GMT
Thanks for the reply. I am currently trying to move all the blocks, and their replicas, of
the input files only to a specified location. That is, just before job start-up, check the
input files' locations and move their blocks and replicas to the desired (highly
efficient) data nodes, thereby making sure only these nodes execute the job. (I am assuming
this because I believe each block will be processed by the nearest available map
task.)
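
To make the plan above concrete, here is a minimal in-memory sketch of the bookkeeping step only: given where each block's replicas currently live and a set of preferred data nodes, it lists the replicas that would have to move. It is not Hadoop code and moves nothing. The class ReplicaMovePlanner, the method planMoves, the requireAll flag, and the block and host names are all hypothetical, introduced purely for illustration. On a real cluster the block-to-host map would have to come from HDFS itself (for example via FileSystem.getFileBlockLocations), and whether the scheduler then actually runs map tasks on those nodes is an assumption this sketch does not test.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Hadoop: it only computes a plan in memory.
class ReplicaMovePlanner {

    /**
     * Returns, per block, the replica hosts that would need to be relocated.
     *
     * requireAll = true  : every replica must end up on a preferred node, so
     *                      every replica outside the preferred set is listed.
     * requireAll = false : one replica on a preferred node is enough, so a block
     *                      is listed (with a single replica to move) only when
     *                      none of its replicas is on a preferred node.
     * Blocks that need no move are omitted from the result.
     */
    static Map<String, List<String>> planMoves(
            Map<String, List<String>> replicasByBlock,
            Set<String> preferredNodes,
            boolean requireAll) {
        Map<String, List<String>> plan = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : replicasByBlock.entrySet()) {
            List<String> outside = new ArrayList<>();
            boolean hasPreferredReplica = false;
            for (String host : entry.getValue()) {
                if (preferredNodes.contains(host)) {
                    hasPreferredReplica = true;
                } else {
                    outside.add(host);
                }
            }
            if (outside.isEmpty()) {
                continue; // all replicas are already where we want them
            }
            if (requireAll) {
                plan.put(entry.getKey(), outside);
            } else if (!hasPreferredReplica) {
                // Moving any single replica satisfies the weaker policy.
                plan.put(entry.getKey(), Collections.singletonList(outside.get(0)));
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        // Made-up block IDs and host names, for illustration only.
        Map<String, List<String>> replicas = new LinkedHashMap<>();
        replicas.put("blk_1", Arrays.asList("dn1", "dn2", "dn3"));
        replicas.put("blk_2", Arrays.asList("dn4", "dn5", "dn6"));
        replicas.put("blk_3", Arrays.asList("dn1", "dn4", "dn5"));
        Set<String> preferred = new HashSet<>(Arrays.asList("dn1", "dn2", "dn3"));

        System.out.println("all replicas: " + planMoves(replicas, preferred, true));
        System.out.println("one replica:  " + planMoves(replicas, preferred, false));
    }
}
```

With the made-up data in main, the strict policy lists five replicas to move across two blocks, while the one-replica policy lists a single replica. That difference is the move overhead raised in the earlier question about performance, and it is why the choice between pinning every replica and pinning only one matters.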

And in your reply you mentioned that some of the work should be initiated from the client.
Is it the JobClient class you are talking about?


--- On Fri, 2/19/10, Wang Xu <gnawux@gmail.com> wrote:

From: Wang Xu <gnawux@gmail.com>
Subject: Re: Question on job scheduling
To: common-dev@hadoop.apache.org
Date: Friday, February 19, 2010, 7:25 AM

On Thu, Feb 18, 2010 at 12:00 AM, arun kumar <arunkumar_skcet@yahoo.com> wrote:
> My questions are:
> 1. Will such a change improve the performance, considering the overhead caused by moving
> the data blocks?

In some special cases it might improve performance, but it depends
on your application.

> 2. I believe I will have to start from the NameNode to move the blocks. If anyone can
> give me a brief explanation of the process to implement this, or even sources of information
> on this, it would be very helpful.

I think some of the work might be initiated from the client. Could you
describe what you want to do in detail?
 1. Do you want to specify the datanodes that store particular blocks, or do
you only want some blocks to be located together?
 2. Do you want to specify the location of all the replicas of a block,
or only the location of one replica?

Wang Xu
Stephen Leacock - "I detest life-insurance agents: they always argue
that I shall some day die, which is not so." -
