From: Raj Hadoop
Reply-To: Raj Hadoop
To: "user@hadoop.apache.org", "chris@embree.us"
Date: Mon, 20 May 2013 13:55:59 -0700 (PDT)
Subject: Re: Low latency data access Vs High throughput of data

Hi Chris,

Thanks for the explanation.

Regards,
Raj



From: Chris Embree <cembree@gmail.com>
To: user@hadoop.apache.org; Raj Hadoop <hadoopraj@yahoo.com>
Sent: Monday, May 20, 2013 1:51 PM
Subject: Re: Low latency data access Vs High throughput of data

I'll take a swing at this one.

Low latency data access: I hit the Enter key (or Submit button) and I expect results within seconds at most. My database query time should be sub-second.
High throughput of data: I want to scan millions of rows of data and count or sum some subset. I expect this will take a few minutes (or much longer, depending on complexity) to complete. Think of batch-style jobs.

Caveats: This is really a map/reduce issue too. Setting up and processing M/R jobs adds a bit of overhead. A couple of projects are working now to move toward lower-latency data access.
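To put rough numbers on that trade-off, here is a minimal Python sketch. It assumes a very simple model (time to answer = fixed startup cost + rows / scan rate), and every figure in it is hypothetical, picked only for illustration rather than measured on any real database or cluster.

```python
def total_seconds(rows, startup_s, rows_per_s):
    """Time to answer = fixed startup cost + time to stream the rows."""
    return startup_s + rows / rows_per_s

# Hypothetical numbers, chosen only for illustration.
indexed_db = {"startup_s": 0.01, "rows_per_s": 50_000}     # quick to start, modest scan rate
batch_job = {"startup_s": 30.0, "rows_per_s": 5_000_000}   # slow to start, very high scan rate

for rows in (1, 1_000_000, 1_000_000_000):
    db = total_seconds(rows, **indexed_db)
    mr = total_seconds(rows, **batch_job)
    print(f"{rows:>13,} rows: database {db:>10.2f}s, batch job {mr:>8.2f}s")
```

Under these made-up figures the low-latency system wins easily for a single row, while the batch job wins once the scan is large enough to amortize its startup cost.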

Also, HDFS stores data in blocks and distributes them across many nodes. This means that there will (almost) always be some network data transfer required to get the final answer, and that "slows" things down a bit, depending on throughput and various other factors.
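A toy Python sketch of why some network transfer is hard to avoid. The assumptions here are mine: a 64 MB block size (it is configurable), hypothetical node names, round-robin placement as a stand-in for the real HDFS placement policy, and no replication.

```python
import math

BLOCK_SIZE_MB = 64  # assumed block size; configurable on a real cluster

def place_blocks(file_size_mb, nodes, block_size_mb=BLOCK_SIZE_MB):
    """Split a file into fixed-size blocks and spread them round-robin.

    Round-robin is a simplification for illustration, not HDFS's
    actual placement policy, and replication is ignored.
    """
    n_blocks = math.ceil(file_size_mb / block_size_mb)
    return {block: nodes[block % len(nodes)] for block in range(n_blocks)}

nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
layout = place_blocks(1000, nodes)      # a 1000 MB file
local = "node-a"                        # pretend the reader runs here
remote = [b for b, n in layout.items() if n != local]
print(f"{len(layout)} blocks, {len(remote)} of them on other nodes")
```

Wherever the reader runs in this toy layout, most blocks live on another node, so producing one final answer means moving at least some data or partial results over the network.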

Hope that helps. :)


On Mon, May 20, 2013 at 10:48 AM, Raj Hadoop <hadoopraj@yahoo.com> wrote:

Hi,

I have a basic question on HDFS. I was reading that HDFS doesn't work well with low latency data access. Rather, it is designed for high throughput of data. Can you please explain in simple words the difference between "Low latency data access Vs High throughput of data"?

Thanks,
Raj
