hbase-user mailing list archives

From Vidhyashankar Venkataraman <vidhy...@yahoo-inc.com>
Subject Re: Unresponsive master in Hbase 0.90.0
Date Mon, 31 Jan 2011 17:54:52 GMT
With hadoop-append turned on, the HBase cluster no longer has the master problems. We will try
to find out why it wasn't working with a non-append version of Hadoop (with a previous version
of Hadoop, it was getting stuck while splitting logs).

But there are other issues now (with append turned on) that we are trying to resolve. The
region server hosting the META region chokes after a table is loaded with around 100 regions
per server. This is likely the target load that we wanted to have: it worked in 0.89 with the
same number of nodes, and HBase 0.90 worked fine with 40 nodes, which is why I started straight
with this number. The node can be pinged but is not accessible through ssh, and as a result I
am unable to perform most HBase operations on the cluster.

Can the RS hosting META be a potential bottleneck in the system at all? (I will try shutting
down that particular node to see what happens.)

Vidhya


On 1/28/11 3:49 PM, "Vidhyashankar Venkataraman" <vidhyash@yahoo-inc.com> wrote:

64 bit Java 1.6.

Why is the master even trying to issue a split with an empty log/region in hand?
(private List<Path> splitLog(final FileStatus[] logfiles))

V

On 1/28/11 3:06 PM, "Todd Lipcon" <todd@cloudera.com> wrote:

The 16000 second sleep is really strange... never seen anything like it.

What JVM are you running?

-Todd

On Fri, Jan 28, 2011 at 11:29 AM, Stack <stack@duboce.net> wrote:

> On Fri, Jan 28, 2011 at 11:23 AM, Vidhyashankar Venkataraman
> <vidhyash@yahoo-inc.com> wrote:
> > We are working on trying to fix this (cc'ed Adam as well).
> >
> >>> Hmm.. maybe before you restart remove the directory
> >>> hdfs://b3110120.yst.yahoo.net:4600/hbase/.logs/ completely so no files
> >>> to be processed on restart.
> >
> > This one I had tried during one of the attempts: it created a new logs
> > directory and still hung at some point, which I think was the same point.
> > (I will have to dig in to see what exactly happened.)
> >
> > We haven't yet looked at that part of the code, but why is the master even
> > trying to issue a split with an empty log/region in hand?
> >
>
> Can you tar up one of these regionserver dirs and put it somewhere I
> can pull?  I'll try it over here.
> St.Ack
>



--
Todd Lipcon
Software Engineer, Cloudera


