From: Radhe Radhe
To: "user@hadoop.apache.org"
Subject: Streaming data access in HDFS: Design Feature
Date: Wed, 5 Mar 2014 13:38:52 +0530
Hello All,

Can anyone please explain what we mean by streaming data access in HDFS?

Data is usually copied to HDFS, and in HDFS the data is split across DataNodes in blocks.
Say, for example, I have an input file of 10240 MB (10 GB) in size and a block size of 64 MB. Then there will be 160 blocks.
These blocks will be distributed across the DataNodes.
Now the Mappers will read data from these DataNodes, keeping the data locality feature in mind (i.e., blocks local to a DataNode will be read by the map tasks running on that DataNode).

Can you please point out where "streaming data access in HDFS" comes into the picture here?

Thanks,
RR
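P.S. For reference, the block arithmetic in my example can be checked with a short Python sketch. The function name `block_count` is hypothetical, and the sketch assumes sizes are given in whole megabytes and that a final partial block still counts as one block.

```python
def block_count(file_size_mb: int, block_size_mb: int) -> int:
    """Return how many blocks a file of the given size occupies.

    Assumption: sizes are whole megabytes, and a trailing partial
    block counts as one full entry in the block list.
    """
    if file_size_mb < 0 or block_size_mb <= 0:
        raise ValueError("file size must be >= 0 and block size must be > 0")
    # Integer ceiling division: round up when the file does not
    # divide evenly into blocks.
    return (file_size_mb + block_size_mb - 1) // block_size_mb


if __name__ == "__main__":
    # The example from the message: a 10240 MB file with 64 MB blocks.
    print(block_count(10240, 64))  # 160
```

With these numbers the file divides evenly, so there is no partial block; a file just 1 MB larger would need one extra block.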