hadoop-common-dev mailing list archives

From momina khan <momina.a...@gmail.com>
Subject Re: data locality on HDFS
Date Sat, 08 May 2010 03:16:34 GMT

I am still going in circles... I still can't pinpoint a single
function call that interacts with HDFS for block locations. It
is as if the files are making circular calls to getBlockLocations(), which
is implemented such that it calls the same function in a different
class. I mean, it is not talking to HDFS anywhere.

Please help!

On 5/7/10, Amogh Vasekar <amogh@yahoo-inc.com> wrote:
> Hi,
> The (o.a.h.fs) FileSystem API has getFileBlockLocations(), which is used to
> determine which hosts hold the replicas of a file's blocks.
> In the general case, (o.a.h.mapreduce.lib.input) FileInputFormat's getSplits()
> calls this method, and the returned hosts are passed on for job scheduling
> along with the split info.
> Hope this is what you were looking for.
> Amogh
> On 5/7/10 4:22 PM, "momina khan" <momina.azam@gmail.com> wrote:
> Hi,
> I am trying to figure out how Hadoop uses data locality to schedule maps on
> nodes that locally store the map input. Going through the code, I am going in
> circles between a couple of files but not really getting anywhere. That
> is to say, I can't locate the HDFS API or function that returns the
> list of nodes that store the replicas of, say, a block!
> I am going from FSNamesystem.java to DFSClient.java to
> BlocksWithLocations.java to DatanodeDescriptor.java and then back again
> without getting to the HDFS interface that reports which nodes store the
> replicas of a block!
> Someone please help!
> momina
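
The "circular" feeling described in this thread most likely comes from delegation: as far as I can tell, FileSystem.getFileBlockLocations() is implemented for HDFS by DistributedFileSystem, which forwards to DFSClient, which in turn calls getBlockLocations() on a ClientProtocol RPC proxy, and only the NameNode side (FSNamesystem) actually knows where the replicas are. Below is a minimal, self-contained Java sketch of that shape. It is a toy model, not Hadoop code: every class name starting with Toy, the LocalityDemo class, the host names, and the one-block-per-file simplification are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the client-to-NameNode RPC interface.
interface ToyClientProtocol {
    List<String> getBlockLocations(String src, long offset, long length);
}

// Hypothetical NameNode side: the only layer that owns the file -> hosts map.
// Simplification: one block per file, so offset and length are ignored.
class ToyNameNode implements ToyClientProtocol {
    private final Map<String, List<String>> replicaHosts = new HashMap<>();

    void addFile(String src, List<String> hosts) {
        replicaHosts.put(src, hosts);
    }

    @Override
    public List<String> getBlockLocations(String src, long offset, long length) {
        return replicaHosts.getOrDefault(src, List.of());
    }
}

// Hypothetical client: holds a proxy to the NameNode and only forwards the call.
class ToyDfsClient {
    private final ToyClientProtocol namenode;

    ToyDfsClient(ToyClientProtocol namenode) {
        this.namenode = namenode;
    }

    List<String> getBlockLocations(String src, long offset, long length) {
        // In a real cluster this hop would cross the network as an RPC.
        return namenode.getBlockLocations(src, offset, length);
    }
}

// Hypothetical file system facade: what an input format would call.
class ToyFileSystem {
    private final ToyDfsClient dfs;

    ToyFileSystem(ToyDfsClient dfs) {
        this.dfs = dfs;
    }

    List<String> getFileBlockLocations(String path, long start, long len) {
        return dfs.getBlockLocations(path, start, len);
    }
}

class LocalityDemo {
    // Wires the three layers together and asks for the hosts of one file.
    static List<String> lookup(String path) {
        ToyNameNode nn = new ToyNameNode();
        nn.addFile("/data/input.txt", List.of("node1", "node2", "node3"));
        ToyFileSystem fs = new ToyFileSystem(new ToyDfsClient(nn));
        return fs.getFileBlockLocations(path, 0L, 1024L);
    }

    public static void main(String[] args) {
        System.out.println(lookup("/data/input.txt"));
    }
}
```

Each layer has a method with nearly the same name that simply calls the next one, which is why reading the source can look like a loop. The chain ends at the interface implemented by the NameNode, so that interface is probably the place to look for the call that "talks to HDFS".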
