hadoop-mapreduce-user mailing list archives

From Mohammad Tariq <donta...@gmail.com>
Subject Re: Anatomy of read in hdfs
Date Sat, 08 Apr 2017 07:19:06 GMT
Hi Sidharth,

When you read data from HDFS through a framework such as MapReduce, the
blocks of an HDFS file are read in parallel by the multiple mappers that the
job creates. To be precise, the mappers read input splits rather than raw
blocks.

On the other hand, a plain standalone Java program is just a
single-threaded process and will read the data sequentially.
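
To make the difference concrete, here is a rough sketch in plain Java. It
does not use the HDFS or MapReduce APIs at all: the "file" is an in-memory
byte array, the block size is tiny, and threads stand in for mappers (in a
real job the mappers are separate tasks, usually on different nodes). The
class and method names (BlockReadDemo, splits, countLines, readSequential,
readParallel) are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BlockReadDemo {

    // Hypothetical helper: cut a file of fileLen bytes into {offset, length}
    // ranges, one per block. This only mimics splits that line up with blocks.
    static List<long[]> splits(long fileLen, long blockSize) {
        List<long[]> ranges = new ArrayList<>();
        for (long off = 0; off < fileLen; off += blockSize) {
            ranges.add(new long[] {off, Math.min(blockSize, fileLen - off)});
        }
        return ranges;
    }

    // Stand-in for the work one reader does on its range: count newline bytes.
    static long countLines(byte[] file, long offset, long length) {
        long n = 0;
        for (long i = offset; i < offset + length; i++) {
            if (file[(int) i] == '\n') n++;
        }
        return n;
    }

    // Standalone-program style: a single thread walks the blocks in order.
    static long readSequential(byte[] file, long blockSize) {
        long total = 0;
        for (long[] r : splits(file.length, blockSize)) {
            total += countLines(file, r[0], r[1]);
        }
        return total;
    }

    // MapReduce style (simulated): one task per split, all running concurrently.
    static long readParallel(byte[] file, long blockSize) {
        List<long[]> ranges = splits(file.length, blockSize);
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, ranges.size()));
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (long[] r : ranges) {
                Callable<Long> mapper = () -> countLines(file, r[0], r[1]);
                results.add(pool.submit(mapper));
            }
            long total = 0;
            for (Future<Long> f : results) total += f.get();
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        byte[] file = "a\nbb\nccc\ndddd\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println("splits:     " + splits(file.length, 4).size());
        System.out.println("sequential: " + readSequential(file, 4));
        System.out.println("parallel:   " + readParallel(file, 4));
    }
}
```

Both methods give the same answer; the only difference is whether the
ranges are processed one after another by one reader or at the same time by
several. Each individual reader still reads its own range sequentially.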

On Friday, April 7, 2017, Sidharth Kumar <sidharthkumar2707@gmail.com>
wrote:

> Thanks for your response. But I didn't understand yet. If you don't mind,
> can you tell me what you mean by "*With Hadoop, the idea is to parallelize
> the readers (one per block for the mapper) with processing framework like
> MapReduce.*"
>
> Also, how will the concept of parallelizing the readers work with HDFS?
>
> Thanks a lot in advance for your help.
>
>
> Regards
> Sidharth
>
> On 07-Apr-2017 1:04 PM, "Philippe Kernévez" <pkernevez@octo.com> wrote:
>
> Hi Sidharth,
>
> The reads are sequential.
> With Hadoop, the idea is to parallelize the readers (one per block for the
> mapper) with processing framework like MapReduce.
>
> Regards,
> Philippe
>
>
> On Thu, Apr 6, 2017 at 9:55 PM, Sidharth Kumar
> <sidharthkumar2707@gmail.com> wrote:
>
>> Hi Genies,
>>
>> I have a small doubt: is the HDFS read operation a parallel or a
>> sequential process? From my understanding it should be parallel, but the
>> "anatomy of a read" section of "Hadoop: The Definitive Guide" (4th
>> edition) says "*Data is streamed from the datanode back to the client,
>> which calls read() repeatedly on the stream (step 4). When the end of the
>> block is reached, DFSInputStream will close the connection to the
>> datanode, then find the best datanode for the next block (step 5). This
>> happens transparently to the client, which from its point of view is just
>> reading a continuous stream.*"
>>
>> So can you kindly explain how the read operation actually happens?
>>
>>
>> Thanks for your help in advance
>>
>> Sidharth
>>
>>
>
>
> --
> Philippe Kernévez
>
>
>
> Directeur technique (Suisse),
> pkernevez@octo.com
> +41 79 888 33 32
>
> Retrouvez OCTO sur OCTO Talk : http://blog.octo.com
> OCTO Technology http://www.octo.ch
>
>
>

--
Tariq, Mohammad
about.me/mti
