hadoop-mapreduce-user mailing list archives

From Sidharth Kumar <sidharthkumar2...@gmail.com>
Subject Re: Anatomy of read in hdfs
Date Fri, 07 Apr 2017 17:45:54 GMT
Thanks for your response. But I still didn't understand; if you don't mind,
can you tell me what you mean by "*With Hadoop, the idea is to parallelize
the readers (one per block for the mapper) with a processing framework like
MapReduce.*"

And also, how does the concept of parallelizing the readers work with HDFS?

Thanks a lot in advance for your help.


Regards
Sidharth

On 07-Apr-2017 1:04 PM, "Philippe Kernévez" <pkernevez@octo.com> wrote:

Hi Sidharth,

The reads are sequential.
With Hadoop, the idea is to parallelize the readers (one per block for the
mapper) with a processing framework like MapReduce.
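
To make that concrete, here is a toy Java sketch of the idea (all class and
method names are illustrative only, not the real HDFS client API): within one
stream the blocks are read strictly one after another, and parallelism only
comes from running one such reader per block, as MapReduce does with its
mappers.

```java
import java.util.List;

// Toy simulation of how an HDFS client reads a file block by block.
// The real logic lives in DFSInputStream inside the HDFS client; this
// sketch only mimics the sequence of steps, not the actual API.
public class SequentialReadSketch {

    // A block and the datanodes holding its replicas.
    record Block(int id, List<String> datanodes) {}

    // Real HDFS picks the closest replica by network topology (same node,
    // then same rack); this sketch just takes the first replica listed.
    static String pickBestDatanode(Block b) {
        return b.datanodes().get(0);
    }

    // Sequential read: one connection at a time, one block at a time.
    // The client just sees a single continuous stream (steps 4 and 5
    // from the "anatomy of a file read" description).
    static String readFile(List<Block> blocks) {
        StringBuilder stream = new StringBuilder();
        for (Block b : blocks) {
            String dn = pickBestDatanode(b);       // step 5: locate the next block
            stream.append("block").append(b.id())  // step 4: stream its bytes
                  .append('@').append(dn).append(' ');
            // the connection to dn would be closed here before moving on
        }
        return stream.toString().trim();
    }

    public static void main(String[] args) {
        List<Block> blocks = List.of(
                new Block(1, List.of("dn1", "dn2")),
                new Block(2, List.of("dn3", "dn1")));
        System.out.println(readFile(blocks));  // prints "block1@dn1 block2@dn3"
        // With MapReduce, each mapper would instead open its own stream over
        // a single block, so different blocks are read in parallel by tasks.
    }
}
```

Note how the loop itself is strictly sequential; the speed-up in a real
cluster comes from running many copies of that loop at once, one per block.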

Regards,
Philippe


On Thu, Apr 6, 2017 at 9:55 PM, Sidharth Kumar <sidharthkumar2707@gmail.com>
wrote:

> Hi Genies,
>
> I have a small doubt about whether the HDFS read operation is a parallel
> or a sequential process. From my understanding it should be parallel, but
> if I read "Hadoop: The Definitive Guide, 4th edition", in the anatomy of a
> read it says: "*Data is streamed from the datanode back to the client,
> which calls read() repeatedly on the stream (step 4). When the end of the
> block is reached, DFSInputStream will close the connection to the
> datanode, then find the best datanode for the next block (step 5). This
> happens transparently to the client, which from its point of view is just
> reading a continuous stream.*"
>
> So can you kindly explain to me how the read operation exactly happens?
>
>
> Thanks for your help in advance
>
> Sidharth
>
>


-- 
Philippe Kernévez



Directeur technique (Suisse),
pkernevez@octo.com
+41 79 888 33 32

Retrouvez OCTO sur OCTO Talk : http://blog.octo.com
OCTO Technology http://www.octo.ch
