hadoop-mapreduce-user mailing list archives

From Zesheng Wu <wuzeshen...@gmail.com>
Subject Re: HDFS File Writes & Reads
Date Tue, 17 Jun 2014 14:07:48 GMT
1. HDFS doesn't allow parallel writes to the same file.
2. HDFS uses a pipeline to write the multiple replicas, so it doesn't take
three times longer than a traditional file write.
3. HDFS allows parallel reads.
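To see why the pipeline matters for point 2, here is a minimal back-of-the-envelope model (an illustration, not the actual HDFS implementation): a block is streamed through the chain of datanodes as a sequence of packets, so each extra replica adds only one packet-step of latency instead of a full extra copy of the block.

```python
# Hypothetical timing model: a block is split into packets that stream
# through a chain of datanodes (the write pipeline). Times are in
# abstract "packet transfer" steps; these numbers are illustrative only.

def pipelined_time(packets, replicas):
    # The last packet arrives at the last datanode after
    # (packets + replicas - 1) steps: the pipeline fill cost is
    # only (replicas - 1) extra steps, regardless of block size.
    return packets + replicas - 1

def sequential_time(packets, replicas):
    # Naive model: write the whole block to each replica in turn.
    return packets * replicas

# Example: a 128 MB block as 2048 packets of 64 KB, replication factor 3.
print(pipelined_time(2048, 3))   # 2050 steps: barely more than one copy
print(sequential_time(2048, 3))  # 6144 steps: three full copies
```

Under this model the pipelined write costs roughly one block transfer plus a constant, which is why the default replication factor of 3 does not triple the write time.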


2014-06-17 19:17 GMT+08:00 Vijaya Narayana Reddy Bhoomi Reddy <
vijay.bhoomireddy@gmail.com>:

> Hi,
>
> I have a basic question regarding file writes and reads in HDFS. Are the
> file write and read processes sequential activities, or are they executed
> in parallel?
>
> For example, let's assume there is a file File1 which consists of
> three blocks B1, B2 and B3.
>
> 1. Will the write process write B2 only after B1 is complete, and B3 only
> after B2 is complete? Or, for a large file with many blocks, can this happen
> in parallel? In all the Hadoop documentation, I read this to be a
> sequential operation. Does that mean that writing a 1 TB file takes three
> times longer than a traditional file write (due to the default replication
> factor of 3)?
> 2. Is it similar in the case of reads as well?
>
> Could someone kindly provide some clarity on this?
>
> Regards
> Vijay
>



-- 
Best Wishes!

Yours, Zesheng
