From: shashwat shriparv
Date: Wed, 5 Mar 2014 13:47:09 +0530
Subject: Re: Streaming data access in HDFS: Design Feature
To: user
Cc: radhe.krishna.radhe@live.com

Streaming means processing the data as it arrives in HDFS. Hadoop Streaming, for example, enables Hadoop to receive data using executables of different types. I hope you have already read this:
http://hadoop.apache.org/docs/r0.18.1/streaming.html#Hadoop+Streaming

Warm Regards,
Shashwat Shriparv

On Wed, Mar 5, 2014 at 1:38 PM, Radhe Radhe <radhe.krishna.radhe@live.com> wrote:

> Hello All,
>
> Can anyone please explain what we mean by *Streaming data access in HDFS*?
>
> Data is usually copied to HDFS, and in HDFS the data is split across
> DataNodes in blocks.
> Say, for example, I have an input file of 10240 MB (10 GB) and a
> block size of 64 MB. Then there will be 160 blocks.
> These blocks will be distributed across the DataNodes.
> The Mappers will then read data from these DataNodes with the *data
> locality feature* in mind (i.e. blocks local to a DataNode will be read by
> the map tasks running on that DataNode).
>
> Can you please point out where "Streaming data access in HDFS"
> comes into the picture here?
>
> Thanks,
> RR
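As a rough check of the arithmetic in the question, and of what a streaming (front to back, no seeking) read looks like, here is a minimal Python sketch. It is not HDFS client code: `block_count` and `read_sequentially` are hypothetical helper names, and an in-memory buffer stands in for a real file so the example runs anywhere. It assumes "streaming data access" is meant in the sense of the HDFS architecture guide, i.e. write-once, read-many data that is scanned sequentially for high throughput rather than accessed randomly with low latency.

```python
import io


def block_count(file_size_mb: int, block_size_mb: int) -> int:
    """Blocks needed to store a file; the last block may be only partly full."""
    # Integer ceiling division avoids floating-point rounding.
    return -(-file_size_mb // block_size_mb)


def read_sequentially(stream, chunk_size: int):
    """Yield fixed-size chunks from a file-like object, front to back, never seeking."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        yield chunk


# The example from the question: 10240 MB file, 64 MB block size.
blocks = block_count(10240, 64)

# Hypothetical stand-in for a stored file: 200 bytes read in 64-byte chunks.
data = io.BytesIO(b"x" * 200)
chunk_sizes = [len(chunk) for chunk in read_sequentially(data, chunk_size=64)]

print(blocks)
print(chunk_sizes)
```

With the numbers from the question, `block_count(10240, 64)` gives 160, which matches the figure quoted above. The generator only ever moves forward through the data, which is the access pattern a map task uses when it scans its block.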