hbase-dev mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: initial experience with HBase 0.90.1 rc0
Date Fri, 11 Feb 2011 00:54:22 GMT
Thanks for the explanation.
Assuming the mixed class loading is static, why did this situation develop
after 40 minutes of heavy load :-(
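
For reference, the mixed-jar situation discussed in this thread could be spotted before startup with something like the sketch below. It is only an illustration: the function names are made up, nothing here is part of HBase, and it assumes the jars are named hbase-<version>.jar (or hbase-<version>-tests.jar) and sit in $HBASE_HOME and $HBASE_HOME/lib as described in the quoted messages.

```shell
#!/bin/sh
# Sketch only (hypothetical helper names, not part of HBase).
# Print the distinct versions of hbase-*.jar found in <home> and <home>/lib.
hbase_jar_versions() {
  home="$1"
  for jar in "$home"/hbase-*.jar "$home"/lib/hbase-*.jar; do
    [ -e "$jar" ] || continue        # skip glob patterns that matched nothing
    name=$(basename "$jar" .jar)     # e.g. hbase-0.90.1 or hbase-0.90.0-tests
    name="${name%-tests}"            # count the tests jar under its version
    echo "${name#hbase-}"            # strip the "hbase-" prefix
  done | sort -u
}

# Succeed only when at most one hbase version is present.
check_single_hbase_version() {
  count=$(hbase_jar_versions "$1" | wc -l | tr -d ' ')
  [ "$count" -le 1 ]
}
```

Something like `check_single_hbase_version "$HBASE_HOME" || echo "mixed hbase jars"` could be run by hand after an upgrade; whether such a check belongs in the startup scripts is exactly the point debated below.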

On Thu, Feb 10, 2011 at 4:42 PM, Ryan Rawson <ryanobjc@gmail.com> wrote:

> It's a standard linking issue: you get one class from one version and
> another from another version. They are mostly compatible in terms of
> signatures (hence no exceptions) but are subtly incompatible in
> different ways. In the stack trace you posted, the handlers were
> blocked in:
>
>        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.reclaimMemStoreMemory(MemStoreFlusher.java:382)
>
> and the thread:
>
> "regionserver60020.cacheFlusher" daemon prio=10 tid=0x00002aaabc21e000
> nid=0x7717 waiting for monitor entry [0x0000000000000000]
>   java.lang.Thread.State: BLOCKED (on object monitor)
>
> was idle.
>
> The cache flusher thread should be flushing, and yet it's doing
> nothing.  This also happens to be one of the classes that were
> changed.
>
>
>
> On Thu, Feb 10, 2011 at 4:34 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > Can someone comment on my second question ?
> > Thanks
> >
> > On Thu, Feb 10, 2011 at 4:25 PM, Ryan Rawson <ryanobjc@gmail.com> wrote:
> >
> >> As I suspected.
> >>
> >> It's a byproduct of our maven assembly process. The process could be
> >> fixed. I wouldn't mind. I don't support runtime checking of jars,
> >> there is such a thing as too many tests, and this is an example of it.
> >> The check would then need a test, etc, etc.
> >>
> >> At SU we use new directories for each upgrade, copying the config
> >> over. With the lack of -default.xml this is easier than ever (just
> >> copy everything in conf/).  With a symlink pointing at the active
> >> directory, rolling forward or back is as simple as switching the
> >> symlink. I have to recommend this to everyone who doesn't have a
> >> management scheme.
> >>
> >> On Thu, Feb 10, 2011 at 4:20 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >> > hbase/hbase-0.90.1.jar leads lib/hbase-0.90.0.jar in the classpath.
> >> > I wonder
> >> > 1. why hbase jar is placed in two directories - 0.20.6 didn't use such
> >> > structure
> >> > 2. what from lib/hbase-0.90.0.jar could be picked up and why there wasn't
> >> > exception in server log
> >> >
> >> > I think a JIRA should be filed for item 2 above - bail out when the two
> >> > hbase jars from $HBASE_HOME and $HBASE_HOME/lib are of different versions.
> >> >
> >> > Cheers
> >> >
> >> > On Thu, Feb 10, 2011 at 3:40 PM, Ryan Rawson <ryanobjc@gmail.com> wrote:
> >> >
> >> >> What do you get when you:
> >> >>
> >> >> ls lib/hbase*
> >> >>
> >> >> I'm going to guess there is hbase-0.90.0.jar there
> >> >>
> >> >>
> >> >>
> >> >> On Thu, Feb 10, 2011 at 3:25 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >> >> > hbase-0.90.0-tests.jar and hbase-0.90.1.jar co-exist
> >> >> > Would this be a problem ?
> >> >> >
> >> >> > On Thu, Feb 10, 2011 at 3:16 PM, Ryan Rawson <ryanobjc@gmail.com> wrote:
> >> >> >
> >> >> >> You don't have both the old and the new hbase jars in there do you?
> >> >> >>
> >> >> >> -ryan
> >> >> >>
> >> >> >> On Thu, Feb 10, 2011 at 3:12 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >> >> >> > .META. went offline during second flow attempt.
> >> >> >> >
> >> >> >> > The time out I mentioned happened for 1st and 3rd attempts. HBase was
> >> >> >> > restarted before the 1st and 3rd attempts.
> >> >> >> >
> >> >> >> > Here is jstack:
> >> >> >> > http://pastebin.com/EHMSvsRt
> >> >> >> >
> >> >> >> > On Thu, Feb 10, 2011 at 3:04 PM, Stack <stack@duboce.net> wrote:
> >> >> >> >
> >> >> >> >> So, .META. is not online?  What happens if you use the shell at this time?
> >> >> >> >>
> >> >> >> >> Your attachment did not come across, Ted.  Mind pastebin'ing it?
> >> >> >> >>
> >> >> >> >> St.Ack
> >> >> >> >>
> >> >> >> >> On Thu, Feb 10, 2011 at 2:41 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >> >> >> >> > I replaced hbase jar with hbase-0.90.1.jar
> >> >> >> >> > I also upgraded client side jar to hbase-0.90.1.jar
> >> >> >> >> >
> >> >> >> >> > Our map tasks were running faster than before for about 50 minutes. However,
> >> >> >> >> > map tasks then timed out calling flushCommits(). This happened even after
> >> >> >> >> > fresh restart of hbase.
> >> >> >> >> >
> >> >> >> >> > I don't see any exception in region server logs.
> >> >> >> >> >
> >> >> >> >> > In master log, I found:
> >> >> >> >> >
> >> >> >> >> > 2011-02-10 18:24:15,286 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Opened region -ROOT-,,0.70236052 on sjc1-hadoop6.X.com,60020,1297362251595
> >> >> >> >> > 2011-02-10 18:24:15,349 INFO org.apache.hadoop.hbase.catalog.CatalogTracker: Failed verification of .META.,,1 at address=null; org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region is not online: .META.,,1
> >> >> >> >> > 2011-02-10 18:24:15,350 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x12e10d0e31e0000 Creating (or updating) unassigned node for 1028785192 with OFFLINE state
> >> >> >> >> >
> >> >> >> >> > I am attaching region server (which didn't respond to stop-hbase.sh) jstack.
> >> >> >> >> >
> >> >> >> >> > FYI
> >> >> >> >> >
> >> >> >> >> > On Thu, Feb 10, 2011 at 10:10 AM, Stack <stack@duboce.net> wrote:
> >> >> >> >> >>
> >> >> >> >> >> That's probably enough, Ted.  The 0.90.1 hbase-default.xml has an extra
> >> >> >> >> >> config. to enable the experimental HBASE-3455 feature but you can copy
> >> >> >> >> >> that over if you want to try playing with it (it defaults off so you'd
> >> >> >> >> >> copy over the config. if you wanted to set it to true).
> >> >> >> >> >>
> >> >> >> >> >> St.Ack
> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >>
> >> >> >> >
> >> >> >>
> >> >> >
> >> >>
> >> >
> >>
> >
>
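
The directory-per-release layout with a symlink switchover that Ryan describes above could look roughly like the sketch below. All paths, directory names and the function name are hypothetical; the only details taken from the thread are that each upgrade gets a new directory, that everything in conf/ is copied over, and that a symlink selects the active release.

```shell
#!/bin/sh
# Sketch only: hypothetical helper, not an HBase script.
# Usage: switch_hbase_release <base-dir> <old-release-dir> <new-release-dir>
switch_hbase_release() {
  base="$1"   # parent directory holding one subdirectory per release
  old="$2"    # currently active release directory name
  new="$3"    # freshly unpacked release directory name
  # Copy the existing configuration into the new release directory.
  mkdir -p "$base/$new/conf"
  cp -R "$base/$old/conf/." "$base/$new/conf/"
  # Repoint the "current" symlink at the new release; -n replaces the
  # link itself instead of creating a link inside the old target.
  ln -sfn "$new" "$base/current"
}
```

Rolling back under this layout would just be `ln -sfn <old-release-dir> <base-dir>/current`, and because each release keeps its own lib/ directory, an old hbase jar cannot linger next to a new one.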
