hadoop-general mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Re: Why single thread for HDFS?
Date Wed, 07 Jul 2010 09:45:32 GMT
elton sky wrote:
> Steve,
> 
> Seems HP has done block based parallel reading from different datanodes.

yes; very much like IBM's GPFS, only with JBOD storage and the option of 
running code near the data when appropriate.

> Though not from disk level, they achieve 4Gb/s rate with 9 readers (500Mb/s
> each).
> I didn't see anywhere I can download their code to play around, pity~


I do have access to that code, if I can get at the right bit of the 
repository. If you really want me to look at it in detail, ask, with the 
caveat that I'm away for the rest of the month and somewhat busy. Apart 
from that, there's no reason why I shouldn't be able to make the changes 
to DfsClient public. Keep reminding me :)


> BTW, can we specify which disk to read from with Java?
> 

I think right now you get a list of blocks via 
DfsClient.getBlockLocations(); this is a list of the hosts where each 
block lives. There is no information about which disk the block sits on 
within a specific host.
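For what it's worth, the public FileSystem API already exposes that 
per-block host list, so you can see what the client has to work with. A 
minimal sketch (the file path is just an example):

  import java.util.Arrays;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.BlockLocation;
  import org.apache.hadoop.fs.FileStatus;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class ListBlockLocations {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      FileSystem fs = FileSystem.get(conf);
      Path file = new Path(args[0]);
      FileStatus status = fs.getFileStatus(file);

      // One BlockLocation per block: byte offset, length, and the hosts
      // holding a replica. Note there is nothing here about which disk
      // on the host the block is stored on.
      BlockLocation[] blocks =
          fs.getFileBlockLocations(status, 0, status.getLen());
      for (BlockLocation block : blocks) {
        System.out.println("offset=" + block.getOffset()
            + " length=" + block.getLength()
            + " hosts=" + Arrays.toString(block.getHosts()));
      }
    }
  }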

I believe that what Russ did was move the decision out of DfsInputStream 
-which picks a block location for you, with a bias towards the local 
host- and instead let the calling program decide where to fetch each 
block from. This meant he could set the renderer up to request blocks 
from different hosts.
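To give a feel for what the calling program can then do -this is just a 
sketch of the idea, not Russ's code- you can already open one stream per 
block and issue positioned reads of each block's byte range in parallel. 
With the stock client, DfsInputStream still chooses the replica for each 
read; exposing that choice to the caller is exactly the bit that needs 
changes inside DfsClient:

  import java.util.concurrent.ExecutorService;
  import java.util.concurrent.Executors;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.BlockLocation;
  import org.apache.hadoop.fs.FSDataInputStream;
  import org.apache.hadoop.fs.FileStatus;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class ParallelBlockReads {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      final FileSystem fs = FileSystem.get(conf);
      final Path file = new Path(args[0]);
      FileStatus status = fs.getFileStatus(file);
      BlockLocation[] blocks =
          fs.getFileBlockLocations(status, 0, status.getLen());

      ExecutorService pool = Executors.newFixedThreadPool(blocks.length);
      for (final BlockLocation block : blocks) {
        pool.submit(new Runnable() {
          public void run() {
            try {
              // A separate stream per block; positioned reads on
              // different streams can proceed concurrently against
              // different datanodes.
              FSDataInputStream in = fs.open(file);
              try {
                byte[] buffer = new byte[(int) block.getLength()];
                in.readFully(block.getOffset(), buffer);
                // hand the buffer to the renderer here, tagged with
                // block.getOffset() so pages can be reassembled in order
              } finally {
                in.close();
              }
            } catch (Exception e) {
              e.printStackTrace();
            }
          }
        });
      }
      pool.shutdown();
    }
  }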

He had tried to use the JT to schedule the rendering code, but that 
didn't work, as MapReduce has the notion of "reduction": less data comes 
out than goes in, so it moves work to where the data is. Rendering is 
more of a MapExpand; the operation is the transformation of PDF pages 
into 600dpi 32bpp bitmaps, which then need to be streamed to the (very 
large) printer at its print rate, in the correct order. It was easiest to 
have a specific machine in the cluster -with no datanode or TT- set up to 
do the rendering, and just ask the filesystem where things are.

Like I said, I don't think there was anything tricky done in DfsClient; 
it was more a matter of making some data that is known internally to the 
DfsClient code public, so that the client app can decide where to fetch 
data from. If the DfsClient knew which HDD the data was on in a datanode, 
the client app could use that in its decision making too, so that if the 
9 machines each had 6 HDDs, you could keep them all busy.
