hbase-dev mailing list archives

From Shrijeet Paliwal <shrij...@rocketfuel.com>
Subject Re: Runtime exceptions during meta scan
Date Thu, 15 Dec 2011 03:25:52 GMT
Created https://issues.apache.org/jira/browse/HBASE-5035

On Wed, Dec 14, 2011 at 1:17 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> I am not sure.
> If you patch your build with the upcoming patch, we should be able to get
> more information.
>
> Thanks Shrijeet.
>
> On Wed, Dec 14, 2011 at 1:15 PM, Shrijeet Paliwal
> <shrijeet@rocketfuel.com> wrote:
>
>> I will open the jira.
>>
>> > Was there region splitting / transition at the time of this problem ? I
>> > would assume the NPE is related to region transitions.
>>
>> I am not sure if that was happening. If it happens again, I will
>> check. But there was one more exception, the
>> ArrayIndexOutOfBoundsException, which I mentioned earlier
>> (http://pastie.org/2987927). I wonder if the region transition theory can
>> explain that as well.
>>
>> On Wed, Dec 14, 2011 at 12:45 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>> > Shrijeet:
>> > When I remove the try/catch block, HCM compiles.
>> > Do you mind filing a JIRA for the issue so that other developers can
>> > comment ?
>> >
>> > Null check for regionInfo should be added.
>> >
>> > Was there region splitting / transition at the time of this problem ? I
>> > would assume the NPE is related to region transitions.
>> >
>> > Cheers
>> >
>> > On Wed, Dec 14, 2011 at 12:33 PM, Shrijeet Paliwal
>> > <shrijeet@rocketfuel.com> wrote:
>> >
>> >> > The following is preventing us from knowing where the NPE came from:
>> >> >          } catch (RuntimeException e) {
>> >> >            throw new IOException(e);
>> >> >          }
>> >> Seems to me there is scope for improving this block. I am trying to
>> >> understand the reasoning behind catching the runtime exception. If
>> >> we know that regionInfo can be null, maybe we can put in a check and
>> >> throw a more meaningful error. What do you think?
>> >>
>> >> > I think you may even be able to reproduce the error by scanning .META.
>> >> > manually.
>> >> Hmm. You mean to say it was not a client problem, but instead a
>> >> server problem? I must add that other clients talking to the server (the
>> >> ones that did not have the JVM tunings I mentioned) did fine even during
>> >> the rough period seen by the affected clients.
>> >>
>> >> On Wed, Dec 14, 2011 at 12:10 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>> >> > The following is preventing us from knowing where the NPE came from:
>> >> >          } catch (RuntimeException e) {
>> >> >            throw new IOException(e);
>> >> >          }
>> >> > Most likely regionInfo was null.
>> >> >
>> >> > I think you may even be able to reproduce the error by scanning .META.
>> >> > manually.
>> >> >
>> >> > Cheers
>> >> >
>> >> > On Wed, Dec 14, 2011 at 11:28 AM, Shrijeet Paliwal
>> >> > <shrijeet@rocketfuel.com> wrote:
>> >> >
>> >> >> Here https://gist.github.com/1478070
>> >> >>
>> >> >> On Wed, Dec 14, 2011 at 11:03 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>> >> >> > I was just saying that upgrading wouldn't incur any regression in
>> >> >> > your codebase.
>> >> >> > The major motivation is to make code matching easier.
>> >> >> >
>> >> >> > Or maybe you can publish the patched HCM.
>> >> >> >
>> >> >> > On Wed, Dec 14, 2011 at 10:59 AM, Shrijeet Paliwal
>> >> >> > <shrijeet@rocketfuel.com> wrote:
>> >> >> >
>> >> >> >> Hi Ted,
>> >> >> >> Thanks for replying.
>> >> >> >> Like I mentioned in the mail, "Line numbers in stack trace may not
>> >> >> >> match with 0.90.3 branch because of extra patches we have."
>> >> >> >> We already have 4508 backported. Curious why you thought of that
>> >> >> >> issue?
>> >> >> >>
>> >> >> >> On Wed, Dec 14, 2011 at 10:56 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>> >> >> >> > Looking at the tip of 0.90, I didn't find the exact line of code
>> >> >> >> > where NPE was thrown.
>> >> >> >> > 0.90.5RC0 is available and it contains HBASE-4508. Is it possible
>> >> >> >> > to upgrade ?
>> >> >> >> > Cheers
>> >> >> >> >
>> >> >> >> > On Wed, Dec 14, 2011 at 10:07 AM, Shrijeet Paliwal
>> >> >> >> > <shrijeet@rocketfuel.com> wrote:
>> >> >> >> >
>> >> >> >> >> For what it is worth, the client was doing a Full GC every 10th
>> >> >> >> >> second while this was happening.
>> >> >> >> >> We recently increased the new gen size on a few of the clients as
>> >> >> >> >> part of an experiment, and all those clients suffered the
>> >> >> >> >> situation I described in the earlier mail.
>> >> >> >> >>
>> >> >> >> >> On Thu, Dec 8, 2011 at 1:13 PM, Shrijeet Paliwal
>> >> >> >> >> <shrijeet@rocketfuel.com> wrote:
>> >> >> >> >> > Hi,
>> >> >> >> >> > Version: 0.90.3 + patches back ported
>> >> >> >> >> >
>> >> >> >> >> > The other day our client started spitting these two runtime
>> >> >> >> >> > exceptions. Not all clients connected to the cluster were under
>> >> >> >> >> > impact. Only 4 of them.
>> >> >> >> >> > While 3 of them were throwing NPE, one of them was
>> >> >> >> >> > throwing ArrayIndexOutOfBoundsException. The errors are:
>> >> >> >> >> >
>> >> >> >> >> > 1. http://pastie.org/2987926
>> >> >> >> >> > 2. http://pastie.org/2987927
>> >> >> >> >> >
>> >> >> >> >> > Clients did not recover from this and I had to bump them.
>> >> >> >> >> >
>> >> >> >> >> > I wish to understand: since we are catching the runtime
>> >> >> >> >> > exception in this block of code, do we expect this kind of
>> >> >> >> >> > behavior? Also, with the given stack trace I cannot tell which
>> >> >> >> >> > line caused the NPE or AIOBE.
>> >> >> >> >> >
>> >> >> >> >> > Thanks.
>> >> >> >> >> >
>> >> >> >> >> > -Shrijeet
>> >> >> >> >> > PS: Line numbers in stack trace may not match with 0.90.3
>> >> >> >> >> > branch because of extra patches we have.
>> >> >> >> >>
>> >> >> >>
>> >> >>
>> >>
>>
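
For reference, here is a rough sketch of the idea discussed above: an explicit null check with a descriptive message, compared with wrapping every RuntimeException in an IOException. This is not the HCM code. RegionInfo, parseRegionInfo, the "info:regioninfo" key and the row layout are all hypothetical stand-ins made up for illustration.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

class MetaRowCheck {
    /** Hypothetical stand-in for a parsed region descriptor; not the HBase class. */
    static final class RegionInfo {
        final String tableName;
        RegionInfo(String tableName) { this.tableName = tableName; }
    }

    /** Hypothetical parser: returns null when the row carries no region info cell. */
    static RegionInfo parseRegionInfo(Map<String, String> row) {
        String table = row.get("info:regioninfo");
        return table == null ? null : new RegionInfo(table);
    }

    /** Wrapping style from the thread: a null here surfaces only as a wrapped NPE. */
    static String tableOfWrapped(Map<String, String> row) throws IOException {
        try {
            return parseRegionInfo(row).tableName;
        } catch (RuntimeException e) {
            throw new IOException(e);
        }
    }

    /** Explicit check: fail early with a message that names the offending row. */
    static String tableOfChecked(String rowKey, Map<String, String> row) throws IOException {
        RegionInfo info = parseRegionInfo(row);
        if (info == null) {
            throw new IOException("No region info found in meta row '" + rowKey + "'");
        }
        return info.tableName;
    }

    public static void main(String[] args) {
        Map<String, String> emptyRow = new HashMap<>();
        try {
            tableOfWrapped(emptyRow);
        } catch (IOException e) {
            System.out.println("wrapped: cause is " + e.getCause().getClass().getName());
        }
        try {
            tableOfChecked("row-1", emptyRow);
        } catch (IOException e) {
            System.out.println("checked: " + e.getMessage());
        }
    }
}
```

With the explicit check, the caller sees which row was missing its region info rather than having to dig the NPE out of the exception cause.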
