hadoop-common-user mailing list archives

From Ken Krugler <kkrugler_li...@transpac.com>
Subject Re: Running on multiple CPU's
Date Mon, 16 Apr 2007 21:42:06 GMT
>Ken Krugler wrote:
>>Has anybody been using Hadoop with ZFS? Would ZFS count as a 
>>readily available shared file system that scales appropriately?
>
>Sun's ZFS?  I don't think that's distributed, is it?  Does it 
>provide a single namespace across an arbitrarily large cluster? 
>From the documentation I can find it just sounds like a better 
>single-node filesystem.  It'd be good for, e.g., mounting 40 1TB 
>drives on a big Sun box, but I don't see how it's meant to, e.g., 
>stitch together 4,000 drives across a cluster of 1,000 nodes into a 
>single filesystem.

I've seen references to using ZFS as a "poor man's cluster", e.g. 
http://blogs.sun.com/erickustarz/entry/poor_man_s_cluster_end and 
http://www.opensolaris.org/jive/message.jspa?messageID=22182#22182.

From reading them, it's clear that ZFS isn't a distributed file 
system, nor does it cleanly support shared access...though people are 
hacking on it to achieve some of these goals.

But for Hadoop users who don't have the requirement to access 4K 
drives on 1K servers (which would be, oh, maybe 99.9% of the universe 
:)) it might be an interesting option for a high performance, high 
reliability FS that scales further than NFS.

Having said that, we don't use Solaris (everything is Linux-based). 
There's a port to Linux in progress, from what I've read, so it might 
become more interesting then.

-- Ken
-- 
Ken Krugler
Krugle, Inc.
+1 530-210-6378
"Find Code, Find Answers"
