hbase-dev mailing list archives

From Vladimir Rodionov <vladrodio...@gmail.com>
Subject Re: HBase 0.94.15: writes stall periodically even under moderate steady load (AWS EC2)
Date Thu, 16 Jan 2014 21:08:35 GMT
I will rerun the tests on our perf cluster (20 nodes, 32 CPUs and 96 GB RAM
each). It runs 0.94 (CDH 4.3); I will let you know the results. A
preliminary run revealed similar issues, so I need to make sure our config
and cluster setup are correct before I kick off the real 500M-row test.


On Thu, Jan 16, 2014 at 12:58 PM, lars hofhansl <larsh@apache.org> wrote:

> In any case, though, I would not expect HBase to have any issue with that,
> unless there are some server issues at the HDFS layer.
>
> @Vladimir, what happens when you run HDFS' DFSIO?
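>
> For reference, a TestDFSIO run looks something like this (the test jar's
> name and location vary by Hadoop distribution, so treat the path as a
> placeholder):
>
>     hadoop jar $HADOOP_HOME/hadoop-test-*.jar TestDFSIO -write \
>         -nrFiles 10 -fileSize 1000
>     hadoop jar $HADOOP_HOME/hadoop-test-*.jar TestDFSIO -read \
>         -nrFiles 10 -fileSize 1000
>     hadoop jar $HADOOP_HOME/hadoop-test-*.jar TestDFSIO -clean
>
> That writes and then reads back ten 1000 MB files and reports per-client
> throughput, which should make an HDFS-level bottleneck obvious.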
>
> -- Lars
>
>
>
> ----- Original Message -----
> From: Bryan Beaudreault <bbeaudreault@hubspot.com>
> To: dev@hbase.apache.org
> Cc:
> Sent: Thursday, January 16, 2014 10:33 AM
> Subject: Re: HBase 0.94.15: writes stall periodically even under moderate
> steady load (AWS EC2)
>
> This might be better on the user list? Anyway...
>
> How many IPC handlers are you giving it? m1.xlarge is very low on CPU. Not
> only does it have just 4 cores (more cores allow more concurrent threads
> with less context switching), but those cores are severely underpowered. I
> would recommend at least c1.xlarge, which is only a bit more expensive. If
> you happen to be doing heavy GC, with 1-2 compactions running and many
> writes incoming, you quickly use up quite a bit of CPU. What are the load
> and CPU usage on 10.38.106.234:50010?
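>
> For what it's worth, the region server RPC handler pool is sized by
> hbase.regionserver.handler.count in hbase-site.xml (the 0.94 default is
> 10). The value below is only an illustrative starting point, not a tuned
> recommendation:
>
>     <property>
>       <name>hbase.regionserver.handler.count</name>
>       <value>30</value>
>     </property>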
>
> Did you see anything about blocking updates in the HBase logs? How much
> memstore are you giving it?
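>
> If updates are being blocked, the region server log will contain lines
> along the lines of "Blocking updates for ..." (per-region memstore limit)
> or "Blocked updates on ..." (global memstore pressure). A quick check,
> assuming a typical log location:
>
>     grep -i "block.* updates" /var/log/hbase/*regionserver*.log
>
> The relevant 0.94 knobs are hbase.hregion.memstore.flush.size,
> hbase.hregion.memstore.block.multiplier, and
> hbase.regionserver.global.memstore.upperLimit / lowerLimit.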
>
>
>
> On Thu, Jan 16, 2014 at 1:17 PM, Andrew Purtell <apurtell@apache.org>
> wrote:
>
> > On Wed, Jan 15, 2014 at 5:32 PM,
> > Vladimir Rodionov <vladrodionov@gmail.com> wrote:
> >
> > > Yes, I am using ephemeral (local) storage. I found that iostat is idle
> > > most of the time under a 3K load, with periodic bursts up to 10% iowait.
> > >
> >
> > Ok, sounds like the problem is higher up the stack.
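> >
> > Side note: aggregate iowait can hide a single saturated ephemeral disk.
> > Per-device extended stats, sampled every 5 seconds while the load runs,
> > are worth a look:
> >
> >     iostat -x 5
> >
> > The %util and await columns will show whether any one disk is the
> > bottleneck.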
> >
> > I see in later emails on this thread a log snippet that shows an issue
> > with the WAL writer pipeline: one of the datanodes is slow, sick, or
> > partially unreachable. If you have uneven point-to-point ping times among
> > your cluster instances, or periodic packet loss, it might still be AWS's
> > fault; otherwise I wonder why the DFSClient says a datanode is sick.
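> >
> > A quick way to check point-to-point latency (hostnames below are
> > placeholders for your cluster nodes):
> >
> >     for h in node1 node2 node3; do ping -c 20 -q $h; done
> >
> > On the HBase side, a sick pipeline usually shows up in the region server
> > log as DFSClient messages like "Error Recovery for block ..." or
> > "Abandoning block ...".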
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)
> >
>
>
