hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: 3-Hour Periodic Network/CPU/Disk/Latency Spikes
Date Fri, 13 Dec 2013 23:33:05 GMT
Patrick:
Attachment didn't go through. 

Cheers

On Dec 13, 2013, at 3:18 PM, Patrick Schless <patrick.schless@gmail.com> wrote:

> Very interesting, I think we may be on to something. I grabbed all the timestamps for
> major compactions completing and put them on a graph (see attached). Each horizontal line
> is an individual server, and the dots are when compactions complete. Each server clearly has
> a cluster of compactions about every 3 hours, and several of the servers are aligned such
> that they are compacting at the same time.
> 
> Should we be managing these compactions ourselves? Would it make more sense to have them
> less frequently (but presumably more expensive), or closer together?
> 
> Thanks,
> Patrick
> 
> 
> On Fri, Dec 13, 2013 at 2:19 PM, Bryan Beaudreault <bbeaudreault@hubspot.com> wrote:
>> Have you taken a look at the logs on the RegionServers during the period?
>> 
>> One possibility is compactions happening organically.  If you were
>> sustaining a certain level of writes most of the time, I could maybe see
>> that every 3 hours enough store files build up to require compactions.
>> 
>> There's nothing else automated in HDFS or HBase that I could see causing
>> this.
>> 
>> On Fri, Dec 13, 2013 at 3:07 PM, Patrick Schless
>> <patrick.schless@gmail.com> wrote:
>> 
>> > CDH4.1.2
>> > HBase 0.92.1
>> > HDFS 2.0.0
>> >
>> >
>> > Every 3 hours, our production HBase cluster does something that causes all
>> > the data nodes to have a sustained spike in CPU/network/disk. The spike
>> > lasts about 30 mins, and during this time the cluster has greatly increased
>> > latencies for our typical application usage.
>> >
>> > I can't find anything in our application that would have such a periodic
>> > and significant behavior. Is there anything that HBase/HDFS might be doing
>> > on its own that would cause this? We're on the default schedule for major
>> > compactions, but I thought that was daily.
>> >
>> > Any ideas what could be causing this?
>> >
>> > Thanks,
>> >
>> > Patrick
>> >
> 
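
Since the attached graph did not come through, the per-server analysis Patrick describes could be approximated roughly as follows. This is only a sketch: it assumes the compaction-completion timestamps have already been extracted from the RegionServer logs into (server, timestamp) pairs (the log parsing is not shown, because the log line format is not given in the thread). The function names, the 30-minute burst gap, and the sample data are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def cluster_events(timestamps, max_gap=timedelta(minutes=30)):
    """Group timestamps into bursts.

    A new burst starts whenever the gap to the previous event exceeds
    max_gap (an assumed threshold, not a value from the thread).
    """
    bursts = []
    for ts in sorted(timestamps):
        if bursts and ts - bursts[-1][-1] <= max_gap:
            bursts[-1].append(ts)
        else:
            bursts.append([ts])
    return bursts


def burst_intervals(events, max_gap=timedelta(minutes=30)):
    """Return, per server, the time between the starts of consecutive bursts."""
    by_server = defaultdict(list)
    for server, ts in events:
        by_server[server].append(ts)

    intervals = {}
    for server, stamps in by_server.items():
        starts = [burst[0] for burst in cluster_events(stamps, max_gap)]
        intervals[server] = [b - a for a, b in zip(starts, starts[1:])]
    return intervals


# Hypothetical sample data: minutes after midnight at which a major
# compaction completed on two made-up servers, "rs1" and "rs2".
base = datetime(2013, 12, 13, 0, 0)
events = (
    [("rs1", base + timedelta(minutes=m)) for m in (0, 5, 12, 180, 186, 360, 371)]
    + [("rs2", base + timedelta(minutes=m)) for m in (2, 9, 183, 190, 362)]
)

intervals = burst_intervals(events)
for server, gaps in sorted(intervals.items()):
    print(server, [str(gap) for gap in gaps])
```

If the intervals for most servers come out close to 3 hours, and the burst start times line up across servers, that would match the pattern described above.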
