hbase-user mailing list archives

From Daniel Einspanjer <deinspan...@mozilla.com>
Subject Re: How do people handle the OS disk partition on Hadoop or HBase nodes?
Date Thu, 30 Sep 2010 23:31:35 GMT
We started with another configuration: four disks, with a quite tiny extra
partition on the first disk for the OS.
Two annoying things there:
1. If we lost disk 1, we lost the whole box.
2. RHEL5 tends to put several GB worth of cruft on the OS partition, such
as unused locale files, and the yum cache alone can easily take up a few
hundred MB.  This means we are constantly cleaning the yum cache and
deleting old log files, and we stay generally space constrained on these boxes.
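The log-pruning half of that chore can be scripted; a minimal sketch (the
`prune_rotated_logs` helper and the 30-day cutoff are my own illustration,
not something from the thread; the yum step would just shell out to
`yum clean all`):

```python
import pathlib
import time

def prune_rotated_logs(log_dir: str, max_age_days: int = 30) -> int:
    """Delete rotated *.gz logs older than max_age_days; return count removed."""
    cutoff = time.time() - max_age_days * 86400
    removed = 0
    for p in pathlib.Path(log_dir).rglob("*.gz"):
        if p.stat().st_mtime < cutoff:
            p.unlink()
            removed += 1
    return removed

# On a real box you would also clear the package cache, e.g.:
#   subprocess.run(["yum", "clean", "all"])
# and then point this at /var/log:
#   prune_rotated_logs("/var/log")
```

Run from cron, this keeps the small OS partition from filling up between
manual cleanups.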


On 9/30/10 7:25 PM, Ryan Rawson wrote:
> What kind of raid are you doing?  Sounds like raid0, which means you
> have a 100% chance of losing the entire box if a single disk goes
> down.  If you choose just one, let's say sda, to host the OS, you are
> now at a 33% chance of losing the box if a disk goes bad - assuming
> that all disks have the same failure probability, of course.
> What we do is install the OS on disk1 (sda), then have 4 JBODs, and I
> put our logs on disk1 as well.  log4j is tricky because it will cause
> issues on disk corruption/IO error events, but I have seen systems
> continue to operate even when log4j can't write to disk due to a
> disk-full scenario.
> There is almost no non-HDFS data; you can literally wedge it into
> something like 8 GB.  The biggest thing that is not HDFS data is the
> logs, and those can go on the HDFS partition; they tend to be low
> volume but can add up over time, since the default is not to reap them.
> On Thu, Sep 30, 2010 at 4:17 PM, Daniel Einspanjer
> <deinspanjer@mozilla.com>  wrote:
>> Right now, most of our boxes have 3 disks in them.  We take a small
>> partition on each of those and RAID-stripe them together to use as the OS
>> partition, then allocate the rest of the disks as JBOD for HDFS storage.
>> We are building out a new cluster and I'm wondering if there are any better
>> ideas for balancing the need for storage and speed of the HDFS disks with
>> having *some place* to put the OS and non-HDFS data.
>> What are other people doing about that?
>> -Daniel
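Ryan's 100% vs 33% figures can be sanity-checked with a quick
back-of-the-envelope sketch (assuming, as he notes, that every disk has
the same failure probability; the function name is mine):

```python
def p_box_lost_given_disk_failure(n_disks: int, os_striped: bool) -> float:
    """Chance the box is lost, given that exactly one disk has failed."""
    if os_striped:
        # OS striped (raid0) across every disk: any disk failure kills the box
        return 1.0
    # OS on a single dedicated disk: the box only dies if that disk is the
    # one that failed (HDFS replication covers the loss of a data disk)
    return 1.0 / n_disks

print(p_box_lost_given_disk_failure(3, os_striped=True))   # 1.0, i.e. 100%
print(p_box_lost_given_disk_failure(3, os_striped=False))  # ~0.33, i.e. 33%
```

The trade-off is exactly as Ryan describes it: striping the OS over every
spindle converts any single-disk failure into a whole-box failure.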
