hbase-dev mailing list archives

From Shrijeet Paliwal <shrij...@rocketfuel.com>
Subject Re: Runtime exceptions during meta scan
Date Wed, 14 Dec 2011 21:15:19 GMT
I will open the jira.

> Was there region splitting / transition at the time of this problem ? I
> would assume the NPE is related to region transitions.

I am not sure whether that was happening. If it happens again, I will
check. But there was one more exception, an
ArrayIndexOutOfBoundsException, which I mentioned earlier
(http://pastie.org/2987927). I wonder if the region transition theory can
explain that as well.
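
Because the catch block quoted below rethrows the runtime exception as `new IOException(e)`, the original failure survives only as the cause of the wrapper. One way to recover its origin on the client side is to walk the cause chain and report the first stack frame of the root exception. This is a minimal sketch under that assumption; the class and method names (`RootCause`, `rootOf`, `describe`) are hypothetical and not part of HBase.

```java
import java.io.IOException;

final class RootCause {
    /** Follows getCause() until the innermost exception is reached. */
    static Throwable rootOf(Throwable t) {
        Throwable cur = t;
        while (cur.getCause() != null) {
            cur = cur.getCause();
        }
        return cur;
    }

    /** Returns the root exception's class name and the frame where it was created. */
    static String describe(Throwable t) {
        Throwable root = rootOf(t);
        StackTraceElement[] frames = root.getStackTrace();
        String where = frames.length > 0 ? frames[0].toString() : "<no stack frames>";
        return root.getClass().getName() + " at " + where;
    }

    public static void main(String[] args) {
        try {
            try {
                int[] empty = new int[0];
                int unused = empty[1]; // deliberately out of bounds
            } catch (RuntimeException e) {
                // Same wrapping pattern as the block discussed in this thread.
                throw new IOException(e);
            }
        } catch (IOException e) {
            System.out.println(describe(e));
        }
    }
}
```

Logging `describe(e)` next to the wrapped exception would show whether the NPE and the ArrayIndexOutOfBoundsException come from the same place, provided the logger is not already printing the full "Caused by" chain.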

On Wed, Dec 14, 2011 at 12:45 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> Shrijeet:
> When I remove the try/catch block, HCM compiles.
> Do you mind filing a JIRA for the issue so that other developers can
> comment ?
>
> Null check for regionInfo should be added.
>
> Was there region splitting / transition at the time of this problem ? I
> would assume the NPE is related to region transitions.
>
> Cheers
>
> On Wed, Dec 14, 2011 at 12:33 PM, Shrijeet Paliwal
> <shrijeet@rocketfuel.com> wrote:
>
>> > The following is preventing us from knowing where the NPE came from:
>> >          } catch (RuntimeException e) {
>> >            throw new IOException(e);
>> >          }
>> Seems to me there is scope for improving this block. I am trying to
>> understand the reasoning behind catching the runtime exception. If
>> we know that regionInfo can be null, maybe we can add a check and
>> throw a more meaningful error. What do you think?
>>
>> > I think you may even be able to reproduce the error by scanning .META.
>> > manually.
>> Hmm. You mean to say it was not a client problem, but a server
>> problem? I must add that the other clients talking to the server (the
>> ones that did not have the JVM tunings I mentioned) did fine even during
>> the bad period seen by the affected clients.
>>
>> On Wed, Dec 14, 2011 at 12:10 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>> > The following is preventing us from knowing where the NPE came from:
>> >          } catch (RuntimeException e) {
>> >            throw new IOException(e);
>> >          }
>> > Most likely regionInfo was null.
>> >
>> > I think you may even be able to reproduce the error by scanning .META.
>> > manually.
>> >
>> > Cheers
>> >
>> > On Wed, Dec 14, 2011 at 11:28 AM, Shrijeet Paliwal
>> > <shrijeet@rocketfuel.com> wrote:
>> >
>> >> Here https://gist.github.com/1478070
>> >>
>> >> On Wed, Dec 14, 2011 at 11:03 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>> >> > I was just saying that upgrading wouldn't incur any regression in your
>> >> > codebase.
>> >> > The major motive is to make code matching easier.
>> >> >
>> >> > Or maybe you can publish the patched HCM.
>> >> >
>> >> > On Wed, Dec 14, 2011 at 10:59 AM, Shrijeet Paliwal
>> >> > <shrijeet@rocketfuel.com> wrote:
>> >> >
>> >> >> Hi Ted,
>> >> >> Thanks for replying.
>> >> >> Like I mentioned in the mail: "Line numbers in stack trace may not
>> >> >> match with 0.90.3 branch because of extra patches we have."
>> >> >> We already have 4508 backported. Curious why you thought of that issue?
>> >> >>
>> >> >> On Wed, Dec 14, 2011 at 10:56 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>> >> >> > Looking at the tip of 0.90, I didn't find the exact line of code
>> >> >> > where NPE was thrown.
>> >> >> > 0.90.5RC0 is available and it contains HBASE-4508. Is it possible to
>> >> >> > upgrade ?
>> >> >> > Cheers
>> >> >> >
>> >> >> > On Wed, Dec 14, 2011 at 10:07 AM, Shrijeet Paliwal
>> >> >> > <shrijeet@rocketfuel.com> wrote:
>> >> >> >
>> >> >> >> For what it is worth, the client was doing a Full GC every 10th
>> >> >> >> second while this was happening.
>> >> >> >> We recently increased the new gen size on a few of the clients as
>> >> >> >> part of an experiment, and all those clients suffer the situation
>> >> >> >> I described in the earlier mail.
>> >> >> >>
>> >> >> >> On Thu, Dec 8, 2011 at 1:13 PM, Shrijeet Paliwal
>> >> >> >> <shrijeet@rocketfuel.com> wrote:
>> >> >> >> > Hi,
>> >> >> >> > Version: 0.90.3 + patches back ported
>> >> >> >> >
>> >> >> >> > The other day our client started spitting these two runtime
>> >> >> >> > exceptions. Not all clients connected to the cluster were
>> >> >> >> > impacted, only 4 of them. While 3 of them were throwing NPE,
>> >> >> >> > one of them was throwing ArrayIndexOutOfBoundsException. The
>> >> >> >> > errors are:
>> >> >> >> >
>> >> >> >> > 1. http://pastie.org/2987926
>> >> >> >> > 2. http://pastie.org/2987927
>> >> >> >> >
>> >> >> >> > Clients did not recover from this and I had to bump them.
>> >> >> >> >
>> >> >> >> > I wish to understand: since we are catching the runtime exception
>> >> >> >> > in this block of code, do we expect this kind of behavior? Also,
>> >> >> >> > with the given stack trace I cannot tell which line caused the
>> >> >> >> > NPE or AIOOBE.
>> >> >> >> >
>> >> >> >> > Thanks.
>> >> >> >> >
>> >> >> >> > -Shrijeet
>> >> >> >> > PS: Line numbers in stack trace may not match with 0.90.3 branch
>> >> >> >> > because of extra patches we have.
>> >> >> >>
>> >> >>
>> >>
>>
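
The null check proposed in this thread, as an alternative to catching every RuntimeException and rethrowing it as an IOException, could look roughly like the sketch below. It does not use the real HBase client classes: `RegionInfo`, `parseRegionInfo`, `locate` and the map-based row are hypothetical stand-ins chosen so the example runs on its own, and the column name is used only as an illustrative key.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

final class MetaRowCheck {
    /** Hypothetical stand-in for a parsed region descriptor; not the HBase class. */
    static final class RegionInfo {
        final String regionName;
        RegionInfo(String regionName) { this.regionName = regionName; }
    }

    /** Hypothetical parser: returns null when the row carries no region info cell. */
    static RegionInfo parseRegionInfo(Map<String, String> row) {
        String value = row.get("info:regioninfo");
        return value == null ? null : new RegionInfo(value);
    }

    /**
     * Checks for the missing value explicitly and reports which row was bad,
     * instead of letting a NullPointerException surface and wrapping it later.
     */
    static String locate(String rowKey, Map<String, String> row) throws IOException {
        RegionInfo regionInfo = parseRegionInfo(row);
        if (regionInfo == null) {
            throw new IOException("No region info found in catalog row '" + rowKey + "'");
        }
        return regionInfo.regionName;
    }

    public static void main(String[] args) {
        Map<String, String> good = new HashMap<>();
        good.put("info:regioninfo", "t1,,1");
        try {
            System.out.println(locate("row-1", good));
            System.out.println(locate("row-2", new HashMap<>()));
        } catch (IOException e) {
            System.out.println("lookup failed: " + e.getMessage());
        }
    }
}
```

With an explicit check the error message names the offending row, which is the "more meaningful error" asked for above. It would not by itself explain the ArrayIndexOutOfBoundsException.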
