hbase-dev mailing list archives

From Bogdan Ghidireac <bog...@ecstend.com>
Subject Re: zookeeper connection hangs during shutdown
Date Tue, 05 Apr 2011 06:31:25 GMT
Please see my answers inline ...

On Mon, Apr 4, 2011 at 8:45 PM, Stack <stack@duboce.net> wrote:
> On Mon, Apr 4, 2011 at 2:30 AM, Bogdan Ghidireac <bogdan@ecstend.com> wrote:
>> Is it possible to add a timeout and then force a System.exit()?
> Yes. Of course.  Sounds bad.  How do you think this scenario came about?

My M/R job reads from a table and creates a lot of data that is
inserted into a second table. Because this new table is empty and I
did not split the keys in advance, the region server where the first
region was created is hit really hard (60-100K ops/sec).

The OOM exception happens during this time, only for one or maybe two
servers. The exception triggers a server shutdown...
Once the initial region splits and the traffic is distributed, the
problem does not happen any more.
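
For reference, a minimal sketch of how split keys could be computed up front so that the new table starts with several regions instead of one. It assumes row keys are spread roughly evenly over the first byte (for example hashed keys), which may not match my real keys; the class and method names are hypothetical, and the resulting array would still have to be handed to the table-creation call.

```java
// Hypothetical helper, not part of HBase: computes evenly spaced split keys.
class SplitKeys {
    /**
     * Returns numRegions - 1 one-byte split keys evenly spaced over 0x00..0xFF.
     * Assumes row keys are uniformly distributed over their first byte.
     */
    static byte[][] evenSplits(int numRegions) {
        if (numRegions < 2 || numRegions > 256) {
            throw new IllegalArgumentException("numRegions must be in [2, 256]");
        }
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            // Boundary between region i-1 and region i.
            splits[i - 1] = new byte[] { (byte) (i * 256 / numRegions) };
        }
        return splits;
    }

    public static void main(String[] args) {
        for (byte[] split : evenSplits(4)) {
            System.out.printf("split key: 0x%02X%n", split[0] & 0xFF);
        }
    }
}
```

With the write load spread over several regions from the start, a single region server should no longer take all of the 60-100K ops/sec.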

> Is the zk ensemble up and running still?

The ZK ensemble is running fine. I have 3 zk servers running ZK 3.3.2.

> What's the last thing in this regionserver log?

This is the RS log

> Anything in the .out file?

This is the System.out/err
http://pastebin.com/gNNVUzvZ

> I've not seen this
> before but, hey, the world is a wide and wonderful place.  We could
> run the zk close inside a thread and interrupt if it goes on too long
> (Let me ask the zk boys if they've seen this before too).

I am subscribed to the ZK list too and I have seen your email. I am using
ZK 3.3.2 ...
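
A rough sketch of the "close inside a thread and interrupt if it goes on too long" idea, in case it helps the discussion. This is not the HBase or ZooKeeper code: the class name, the method name and the 30 second timeout are all made up for illustration, and the Runnable is only a stand-in for the real zk close call.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: runs a close action with an upper bound on how long we wait.
class TimedClose {
    /** Returns true if closeAction finished (normally or with an error) within the timeout. */
    static boolean closeWithTimeout(Runnable closeAction, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "timed-close");
            t.setDaemon(true); // a stuck close must not keep the JVM alive
            return t;
        });
        Future<?> future = executor.submit(closeAction);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker thread
            return false;
        } catch (ExecutionException e) {
            return true; // the close threw, but it did not hang
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            future.cancel(true);
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-in for the real close call (assumption, not the actual shutdown path).
        boolean closed = closeWithTimeout(() -> System.out.println("closing..."), 30_000);
        if (!closed) {
            System.err.println("close timed out; a forced System.exit() could go here");
        }
    }
}
```

If the close ignores interrupts, the daemon thread at least lets the JVM exit once the other threads are done, and the caller can still fall back to System.exit().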

> St.Ack

Thank you,
