hbase-dev mailing list archives

From lars hofhansl <la...@apache.org>
Subject Re: Compactions nice to have features
Date Thu, 09 Oct 2014 05:20:10 GMT
Hi Michael,

your math is right.

I think the issue is that it actually is easy to max out the ToR switch (and hence starve
out other traffic), so we might want to protect the ToR switch from prolonged heavy compaction
traffic in order to keep some of the bandwidth free for other traffic.
Vladimir's issues were around compactions slowing other traffic while they run.

-- Lars

----- Original Message -----
From: Michael Segel <michael_segel@hotmail.com>
To: dev@hbase.apache.org; lars hofhansl <larsh@apache.org>
Cc: Vladimir Rodionov <vladrodionov@gmail.com>
Sent: Wednesday, October 8, 2014 12:30 PM
Subject: Re: Compactions nice to have features

On Oct 5, 2014, at 11:01 PM, lars hofhansl <larsh@apache.org> wrote:

>>> - rack IO throttle. We should add that to accommodate for over subscription at
the ToR level.
>> Can you decipher that, Lars?
> ToR is "Top of Rack" switch. Over subscription means that a ToR switch usually does not
have enough bandwidth to serve traffic in and out of rack at full speed.
> For example if you had 40 machines in a rack with 1GbE links each, and the ToR switch
has a 10GbE uplink, you'd say the ToR switch is 4-to-1 oversubscribed.
> Was just trying to say: "Yeah, we need that" :)
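
That ratio is easy to sanity-check. A minimal sketch in Python, using Lars's example numbers (the function name is mine, not anything from HBase):

```python
def oversubscription_ratio(servers, link_gbps, uplink_gbps):
    """Ratio of aggregate in-rack host bandwidth to the ToR uplink bandwidth."""
    return (servers * link_gbps) / uplink_gbps

# 40 machines with 1GbE links behind a single 10GbE uplink -> 4:1
print(oversubscription_ratio(servers=40, link_gbps=1, uplink_gbps=10))
```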


Rough math… using 3.5” SATA II (7200 RPM) drives, 4 drives would max out 1GbE. So
a server with 12 drives would max out 3 Gb/s, assuming 3.5” drives. 2.5” drives and
SATA III would push this up.
So in theory you could get 5 Gb/s or more from a node.

16 servers per rack… (again YMMV based on power, heat, etc.) that’s 48 Gb/s and up.

If you had 20 servers with smaller (2.5”) drives, 5 Gb/s x 20 = 100 Gb/s.
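The arithmetic above can be sketched the same way, taking the rule of thumb (4 SATA II drives saturate 1GbE) as given; the constants and function names are illustrative only:

```python
GBPS_PER_4_DRIVES = 1  # Gb/s of network traffic ~4 SATA II drives can sustain

def node_gbps(drives):
    """Approximate network demand one node can generate from its disks."""
    return drives / 4 * GBPS_PER_4_DRIVES

def rack_gbps(servers, drives_per_server):
    """Aggregate demand for a full rack of identical nodes."""
    return servers * node_gbps(drives_per_server)

print(node_gbps(12))      # 12-drive node -> 3.0 Gb/s
print(rack_gbps(16, 12))  # 16 such servers -> 48.0 Gb/s
```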

So what’s the width of the fabric?  (YMMV based on ToR) 

I don’t know why you’d want to ‘throttle’, because the limits of the ToR would throttle
you already.

Of course I’m assuming that you’re running a M/R job that’s going full bore. 

Are you seeing this? 
I would imagine that you’d have a long running job maxing out the I/O and see a jump
in I/O wait CPU over time.

And what’s the core to spindle ratio?
