hadoop-common-user mailing list archives

From Dejan Menges <dejan.men...@gmail.com>
Subject Re: Surprisingly, RAID0 provides the best I/O performance whereas no RAID the worst
Date Mon, 01 Aug 2016 09:47:18 GMT
Hi Shady,

We did extensive tests on this and received a fix from Hortonworks, which we
are probably the first and only ones to test, most likely tomorrow evening.
If any Hortonworks folks are reading this, maybe they know the official HDFS
ticket ID for this, if there is one, as I cannot find it in our correspondence.
Long story short: the single server had RAID controllers with 1G and 2G of
cache (both scenarios were tested). It started as a simple benchmark using
TestDFSIO while trying to narrow down the best server-side configuration
(discussions like this one: JBOD vs. RAID0, benchmarking, etc.).
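
For anyone who wants to reproduce this kind of baseline, a typical TestDFSIO
write run looks something like the lines below (the jar path, file count, and
sizes are illustrative and vary by release); comparing one run per replication
factor is what exposes the gap:

  hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-*-tests.jar \
      TestDFSIO -D dfs.replication=1 -write -nrFiles 10 -fileSize 1000

  hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-*-tests.jar \
      TestDFSIO -D dfs.replication=3 -write -nrFiles 10 -fileSize 1000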
However, with 10-12 disks in a single server and the controllers mentioned
above, we got 6-10 times higher write speed when not using replication
(meaning replication factor one). It took literally months to narrow it down
to a single hardcoded value, HdfsConstants.DEFAULT_DATA_SOCKET_SIZE (I'm just
looking at the patch now). In the end,
tcpPeerServer.setReceiveBufferSize(HdfsConstants.DEFAULT_DATA_SOCKET_SIZE)
basically limited write speed to what this constant allows whenever
replication was used, which is super annoying (especially now that more or
less everyone is using network speeds greater than 100 Mbps). This can be found
in b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
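
For reference, HdfsConstants.DEFAULT_DATA_SOCKET_SIZE is hardcoded to
128 * 1024 bytes in the Hadoop source. Below is a back-of-the-envelope sketch
(my own illustration, not the actual patch) of why pinning the receive buffer
caps pipeline throughput; the 10 ms round-trip time is an assumed value, so
plug in your own:

public class SocketBufferCeiling {

    // Mirrors HdfsConstants.DEFAULT_DATA_SOCKET_SIZE (128 KiB).
    static final int DEFAULT_DATA_SOCKET_SIZE = 128 * 1024;

    public static void main(String[] args) {
        // TCP throughput is bounded by receive window / round-trip time.
        double rttSeconds = 0.010; // assumed RTT; measure your own network
        double bytesPerSecond = DEFAULT_DATA_SOCKET_SIZE / rttSeconds;
        double megabitsPerSecond = bytesPerSecond * 8 / 1_000_000;

        // 128 KiB / 10 ms ~= 13.1 MB/s ~= 105 Mbit/s
        System.out.printf("Ceiling: ~%.1f MB/s (~%.0f Mbit/s)%n",
                bytesPerSecond / 1_000_000, megabitsPerSecond);
    }
}

That window-over-RTT bound lands in the same ballpark as the ~100 Mbps write
speeds we kept hitting with replication enabled, while a configurable (or
OS-auto-tuned) buffer would let the window grow to match the actual
bandwidth-delay product.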

On Mon, Aug 1, 2016 at 11:39 AM Shady Xu <shadyxu@gmail.com> wrote:

> Thanks Allen. I am aware of the fact you mentioned and am wondering what
> the await and svctm are on your cluster nodes. If there is no significant
> difference, maybe I should try other ways to tune my HBase.
>
> And Dejan, I've never heard of or noticed what you described. If it's
> true, that's really disappointing; please notify us if there's any
> progress.
> 2016-08-01 15:33 GMT+08:00 Dejan Menges <dejan.menges@gmail.com>:
>> Sorry for jumping in, but speaking of performance... it took us a while
>> to figure out why, whatever disk/RAID0 performance you have, once it
>> comes to HDFS and a replication factor greater than one, disk write
>> speed drops to 100 Mbps... After long, long tests with Hortonworks, they
>> found that the issue is that someone at some point in history hardcoded
>> a value, and whatever setup you have, you are limited by it. Luckily we
>> have a quite powerful testing environment, and the plan is to test the
>> patch later this week. I'm not sure whether there's an official HDFS bug
>> for this; I checked our internal history but didn't see anything like
>> that.
>>
>> This was quite disappointing, as whatever tuning, controllers, and
>> setups you try, it all goes down the drain because of this.
>> On Mon, Aug 1, 2016 at 8:30 AM Allen Wittenauer <aw@apache.org> wrote:
>>> On 2016-07-30 20:12 (-0700), Shady Xu <shadyxu@gmail.com> wrote:
>>> > Thanks Andrew, I know about the disk failure risk and that it's one
>>> > of the reasons why we should use JBOD. But JBOD provides worse
>>> > performance than RAID 0.
>>> It's not about failure: it's about speed. RAID0 performance will drop
>>> like a rock if any one disk in the set is slow. When all the drives are
>>> performing at peak, yes, it's definitely faster. But over time, drive
>>> speed will decline (sometimes to half speed or less!), usually prior to
>>> a failure. That failure may take a while, so in the meantime your
>>> cluster is getting slower ... and slower ... and slower ...
>>>
>>> As a result, JBOD will be significantly faster over the _lifetime_ of
>>> the disks than a comparison made _today_ would suggest.
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@hadoop.apache.org
>>> For additional commands, e-mail: user-help@hadoop.apache.org
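
To make Allen's point concrete, here is a toy model (my own illustration, not
HDFS code): RAID0 stripes every request across all members, so the array runs
at roughly N times the slowest disk, while JBOD lets HDFS place block writes
on each disk independently, so the aggregate is the sum of individual speeds.
The per-disk numbers below are made up.

import java.util.Arrays;

public class Raid0VsJbod {
    public static void main(String[] args) {
        // Hypothetical per-disk sequential write speeds in MB/s; one
        // drive has degraded to half speed ahead of an eventual failure.
        double[] diskSpeeds = {150, 150, 150, 150, 150, 75};

        double slowest = Arrays.stream(diskSpeeds).min().orElse(0);
        double raid0 = diskSpeeds.length * slowest;     // gated by the slowest member
        double jbod = Arrays.stream(diskSpeeds).sum();  // disks run independently

        // Prints: RAID0: ~450 MB/s, JBOD: ~825 MB/s
        System.out.printf("RAID0: ~%.0f MB/s, JBOD: ~%.0f MB/s%n", raid0, jbod);
    }
}

With one degraded drive, the striped array runs at half its healthy 900 MB/s
while JBOD loses only that one disk's shortfall, which is exactly why JBOD
comes out ahead over the lifetime of the disks.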
