hadoop-common-dev mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Re: Strategy Of Replica
Date Tue, 11 Oct 2011 11:32:28 GMT
On 11/10/11 04:49, gschen wrote:

> In HDFS the only thing we can do is set the replication factor to
> change the replication strategy; we cannot change where a block is
> stored or what type of storage holds the data. Consider this case:
> to improve download speed, I could choose to place my block replicas
> near my location or near someone else's location. I mean that users
> could have more options for deciding their block replication
> strategy.

1. In "Apache Hadoop Goes Realtime at Facebook", Dhruba and others 
discuss their use of alternate block placement policies.

2. Russ Perry did some work on rasterization of PDF files in Hadoop 
where the final stage (collecting the output and streaming it to the 
printer) was done on a machine next to the printer. He modified 
DFSClient to provide the location data for all blocks, and had his 
app pick blocks off different machines to keep the network busy, avoid 
overloading any specific machine with disk IO requests, and ensure 
peak bandwidth to the final destination machine.

http://www.hpl.hp.com/techreports/2009/HPL-2009-345.pdf
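
To make the "pick blocks off different machines" idea concrete, here is a minimal sketch. It is not the code from the report above, and it does not touch DFSClient. It assumes the application already has, for each block, the list of hosts holding a replica (in stock Hadoop this kind of data can be obtained from FileSystem.getFileBlockLocations). The class name ReplicaPicker, the method pickHosts, and the greedy least-loaded rule are all hypothetical choices made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: spreads block reads across replica hosts.
class ReplicaPicker {

    // For each block (in order), pick the replica host that has been
    // assigned the fewest reads so far. Ties go to the host listed first.
    static List<String> pickHosts(List<List<String>> replicaHostsPerBlock) {
        Map<String, Integer> load = new HashMap<>();
        List<String> chosen = new ArrayList<>();
        for (List<String> hosts : replicaHostsPerBlock) {
            if (hosts.isEmpty()) {
                throw new IllegalArgumentException("block has no replica hosts");
            }
            String best = hosts.get(0);
            for (String host : hosts) {
                if (load.getOrDefault(host, 0) < load.getOrDefault(best, 0)) {
                    best = host;
                }
            }
            load.merge(best, 1, Integer::sum); // record one more read on this host
            chosen.add(best);
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Four blocks, three replicas each; host names are made up.
        List<List<String>> blocks = Arrays.asList(
            Arrays.asList("a", "b", "c"),
            Arrays.asList("a", "b", "d"),
            Arrays.asList("a", "c", "d"),
            Arrays.asList("b", "c", "d"));
        System.out.println(pickHosts(blocks)); // prints [a, b, c, d]
    }
}
```

A real client would also want to weigh network distance to the destination machine and react to slow or failed datanodes; the greedy count above only addresses the "do not overload one machine" part.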
