hadoop-common-user mailing list archives

From Adam Faris <afa...@linkedin.com>
Subject Re: How bad is this? :)
Date Tue, 09 Jul 2013 15:50:36 GMT
Hi Chris,

You should use a utility like iozone (http://www.iozone.org/) to benchmark your drives while
tuning your filesystem.  You may be surprised at what the measured values show you. :)
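
If iozone is not at hand, a crude before/after sanity check can be scripted. The sketch below is only an illustration, not a substitute for iozone: it times one sequential write plus fsync, and the function name, sizes, and directory argument are made up for this example.

```python
import os
import tempfile
import time


def timed_write(directory, total_bytes=64 * 1024 * 1024, block_size=1024 * 1024):
    """Write total_bytes to a temp file in directory, fsync it, return MB/s."""
    block = b"\0" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            written = 0
            while written < total_bytes:
                chunk = block[: total_bytes - written]
                f.write(chunk)
                written += len(chunk)
            f.flush()
            os.fsync(f.fileno())  # count the time needed to reach the disk
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # clean up the scratch file
    # Guard against a zero reading from the timer on very small writes.
    return (total_bytes / (1024 * 1024)) / max(elapsed, 1e-9)
```

Run it against a directory on the drive under test (for example timed_write("/data/1"), where /data/1 is a hypothetical mount point), once before and once after a tuning change. A single sequential write says nothing about reads or random I/O, so treat the number as a rough hint only.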

We use ext4 for storing HDFS blocks on our compute nodes, and journaling has been left on.
We also have 'writeback' enabled and commits are delayed by 30 seconds.  Slide 21 has suggestions
for tuning ext4: "http://www.slideshare.net/allenwittenauer/2012-lihadoopperf"  Be warned
that with these settings and 3 copies of each block, it's still possible to lose data in the
event of a power loss.  About 2.5 years ago we had a datacenter power failure and I think we lost
6-10 files due to block corruption.  Those files were actively being written when the power
failure happened, so we ended up rerunning those jobs.  Balancing performance against exposure is
something to keep in mind when making these kinds of changes.
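
To see what a node is actually running with, the mount options can be read back from the kernel. The sketch below assumes that 'writeback' and the 30-second commit delay correspond to ext4's data=writeback and commit=30 mount options, and that the text comes from /proc/mounts. The function name and the sample mount point are hypothetical, and options left at their defaults may not be listed at all.

```python
def ext4_journal_options(mounts_text):
    """Return {mount_point: {"data": ..., "commit": ...}} for ext4 entries."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] != "ext4":
            continue
        opts = {}
        for opt in fields[3].split(","):
            key, _, value = opt.partition("=")
            opts[key] = value
        result[fields[1]] = {
            "data": opts.get("data"),      # None means the option is not listed
            "commit": opts.get("commit"),  # value in seconds, kept as a string
        }
    return result


# Hypothetical sample line; on a real node pass open("/proc/mounts").read().
sample = "/dev/sdb1 /data/1 ext4 rw,noatime,data=writeback,commit=30 0 0\n"
print(ext4_journal_options(sample))
```

Running this on every node is a cheap way to confirm that the tuning was applied consistently before comparing benchmark numbers.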

-- Adam

On Jul 9, 2013, at 12:25 AM, Harsh J <harsh@cloudera.com> wrote:

> This is what I remember: If you disable journalling, running fsck
> after a crash will (be required and) take longer. Certainly not a good
> idea to have an extra wait after the cluster loses power and is being
> restarted, etc.
> On Tue, Jul 9, 2013 at 7:42 AM, Chris Embree <cembree@gmail.com> wrote:
>> Hey Hadoop smart folks....
>> I have a tendency to seek optimum performance given my understanding, so
>> that led me to a "brilliant" decision.  We settled on EXT4 for our underlying
>> FS for HDFS.   Greedy for speed I thought, let's turn the journal off and
>> gain the speed benefits.  After all, I have 3 copies of the data.
>> How much does this bother you, given we have a 21 node prod and only 10 node
>> dev cluster?
>> I'm embarrassed to say I did not capture good pre and post change I/O.  In
>> my simple brain, not writing to journal just screams improved I/O.
>> Don't be shy, tell me how badly I have done bad things. (I originally said
>> "screwed the pooch" but I reconsidered our > USA audience. ;)
>> If I'm not incredibly wrong, should we consider higher speed (less safe)
>> file systems?
>> Correct/support my thinking.
>> Chris
> --
> Harsh J
