hadoop-mapreduce-user mailing list archives

From Philippe Kernévez <pkerne...@octo.com>
Subject Re: Anatomy of read in hdfs
Date Fri, 07 Apr 2017 07:33:52 GMT
Hi Sidharth,

The reads are sequential: a single client reads one block at a time over
one logical stream. With Hadoop, the idea is to parallelize the readers
(one mapper per block) with a processing framework like MapReduce.
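To make the distinction concrete, here is a minimal sketch in plain Python
(mock data, not the real HDFS client API): one function mimics a single
client reading block after block, the other mimics MapReduce launching one
reader per block. The block contents and `read_block` helper are invented
for illustration only.

```python
# Sketch (mock, not real HDFS code): contrast a single client's
# sequential read with MapReduce-style per-block parallelism.
from concurrent.futures import ThreadPoolExecutor

# A three-block file; each "block" would live on some datanode.
blocks = [b"block-0 data;", b"block-1 data;", b"block-2 data;"]

def read_block(index):
    """Mock of fetching one block from its best datanode."""
    return blocks[index]

def sequential_client_read():
    """A single HDFS client: the stream moves to the next block's
    datanode only after the current block is fully read."""
    data = b""
    for i in range(len(blocks)):
        data += read_block(i)
    return data

def mapreduce_style_read():
    """MapReduce-style: one reader (mapper) per block, in parallel."""
    with ThreadPoolExecutor(max_workers=len(blocks)) as pool:
        return list(pool.map(read_block, range(len(blocks))))
```

So from the client's point of view it is one continuous sequential stream,
and the parallelism only appears when a framework schedules many such
readers, one per block.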

Regards,
Philippe


On Thu, Apr 6, 2017 at 9:55 PM, Sidharth Kumar <sidharthkumar2707@gmail.com>
wrote:

> Hi Genies,
>
> I have a small doubt: is the HDFS read operation a parallel or a
> sequential process? From my understanding it should be parallel, but
> "Hadoop: The Definitive Guide, 4th Edition", in "Anatomy of a File
> Read", says "*Data is streamed
> from the datanode back **to the client, which calls read() repeatedly on
> the stream (step 4). When the end of the **block is reached,
> DFSInputStream will close the connection to the datanode, then find **the
> best datanode for the next block (step 5). This happens transparently to
> the client, **which from its point of view is just reading a continuous
> stream*."
>
> So could you kindly explain how the read operation actually happens?
>
>
> Thanks for your help in advance
>
> Sidharth
>
>


-- 
Philippe Kernévez



Technical Director (Switzerland),
pkernevez@octo.com
+41 79 888 33 32

Find OCTO on OCTO Talk: http://blog.octo.com
OCTO Technology http://www.octo.ch
