hbase-user mailing list archives

From "Ananth T. Sarathy" <ananth.t.sara...@gmail.com>
Subject Re: hbase on s3 and safemode
Date Wed, 07 Oct 2009 18:04:03 GMT
I suppose we need to, but for now it's kind of a pain because we need to
coordinate our clients.

But the problem is: why was it working, and then all of a sudden it's stuck
in safemode, and how can we get it back up?

Ananth T Sarathy


On Wed, Oct 7, 2009 at 1:58 PM, stack <stack@duboce.net> wrote:

> Can you update to 0.20.0? (Oodles of improvements).
> St.Ack
>
> On Wed, Oct 7, 2009 at 10:56 AM, Ananth T. Sarathy <
> ananth.t.sarathy@gmail.com> wrote:
>
> > I get an error
> >
> > hbase(main):001:0> status "detailed"
> > NoMethodError: undefined method `status' for #<Object:0x5585c0de>
> >        from (hbase):2
> > hbase(main):002:0> status "detailed"
> > NoMethodError: undefined method `status' for #<Object:0x5585c0de>
> >        from (hbase):3
> >
> >
> > we are running 0.19.3
> >
> > Ananth T Sarathy
> >
> >
> > On Wed, Oct 7, 2009 at 1:51 PM, stack <stack@duboce.net> wrote:
> >
> > > This state persists even if you shutdown hbase and zk and restart?
> > >
> > > In shell, do:
> > >
> > > > status "detailed"
> > >
> > > At the top there is a section that lists regions in transition.
> > > Anything there?
> > >
> > > St.Ack
> > >
> > >
> > > On Wed, Oct 7, 2009 at 10:35 AM, Ananth T. Sarathy <
> > > ananth.t.sarathy@gmail.com> wrote:
> > >
> > > > Here is the log  since I started it...
> > > >
> > > > Wed Oct  7 13:27:26 EDT 2009 Starting master on ip-10-244-9-171
> > > > ulimit -n 1024
> > > > 2009-10-07 13:27:26,404 INFO org.apache.hadoop.hbase.master.HMaster: vmName=Java HotSpot(TM) 64-Bit Server VM, vmVendor=Sun Microsystems Inc., vmVersion=14.2-b01
> > > > 2009-10-07 13:27:26,405 INFO org.apache.hadoop.hbase.master.HMaster: vmInputArguments=[-Xmx2000m, -XX:+HeapDumpOnOutOfMemoryError, -Djava.io.tmpdir=/mnt/tmp, -Dhbase.log.dir=/mnt/apps/hadoop/hbase/bin/../logs, -Dhbase.log.file=hbase-root-master-ip-10-244-9-171.log, -Dhbase.home.dir=/mnt/apps/hadoop/hbase/bin/.., -Dhbase.id.str=root, -Dhbase.root.logger=INFO,DRFA, -Djava.library.path=/mnt/apps/hadoop/hbase/bin/../lib/native/Linux-amd64-64]
> > > > 2009-10-07 13:27:27,525 INFO org.apache.hadoop.hbase.master.HMaster: Root region dir: s3://hbase2.s3.amazonaws.com:80/hbasedata/-ROOT-/70236052
> > > > 2009-10-07 13:27:27,751 INFO org.apache.hadoop.hbase.ipc.HBaseRpcMetrics: Initializing RPC Metrics with hostName=HMaster, port=60000
> > > > 2009-10-07 13:27:27,827 INFO org.apache.hadoop.hbase.master.HMaster: HMaster initialized on 10.244.9.171:60000
> > > > 2009-10-07 13:27:27,829 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=Master, sessionId=HMaster
> > > > 2009-10-07 13:27:27,830 INFO org.apache.hadoop.hbase.master.metrics.MasterMetrics: Initialized
> > > > 2009-10-07 13:27:27,932 INFO org.mortbay.util.Credential: Checking Resource aliases
> > > > 2009-10-07 13:27:27,936 INFO org.mortbay.http.HttpServer: Version Jetty/5.1.4
> > > > 2009-10-07 13:27:27,936 INFO org.mortbay.util.Container: Started HttpContext[/logs,/logs]
> > > > 2009-10-07 13:27:28,202 INFO org.mortbay.util.Container: Started org.mortbay.jetty.servlet.WebApplicationHandler@3209fa8f
> > > > 2009-10-07 13:27:28,244 INFO org.mortbay.util.Container: Started WebApplicationContext[/static,/static]
> > > > 2009-10-07 13:27:28,361 INFO org.mortbay.util.Container: Started org.mortbay.jetty.servlet.WebApplicationHandler@b0c0f66
> > > > 2009-10-07 13:27:28,364 INFO org.mortbay.util.Container: Started WebApplicationContext[/,/]
> > > > 2009-10-07 13:27:28,636 INFO org.mortbay.util.Container: Started org.mortbay.jetty.servlet.WebApplicationHandler@3c2d7440
> > > > 2009-10-07 13:27:28,638 INFO org.mortbay.util.Container: Started WebApplicationContext[/api,rest]
> > > > 2009-10-07 13:27:28,639 INFO org.mortbay.http.SocketListener: Started SocketListener on 0.0.0.0:60010
> > > > 2009-10-07 13:27:28,639 INFO org.mortbay.util.Container: Started org.mortbay.jetty.Server@28b301f2
> > > > 2009-10-07 13:27:28,640 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server Responder: starting
> > > > 2009-10-07 13:27:28,641 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server listener on 60000: starting
> > > > 2009-10-07 13:27:28,641 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 60000: starting
> > > > 2009-10-07 13:27:28,641 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 1 on 60000: starting
> > > > 2009-10-07 13:27:28,641 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60000: starting
> > > > 2009-10-07 13:27:28,642 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60000: starting
> > > > 2009-10-07 13:27:28,642 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 4 on 60000: starting
> > > > 2009-10-07 13:27:28,642 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60000: starting
> > > > 2009-10-07 13:27:28,642 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 6 on 60000: starting
> > > > 2009-10-07 13:27:28,642 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60000: starting
> > > > 2009-10-07 13:27:28,642 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 8 on 60000: starting
> > > > 2009-10-07 13:27:28,642 DEBUG org.apache.hadoop.hbase.master.HMaster: Started service threads
> > > > 2009-10-07 13:27:28,643 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60000: starting
> > > > 2009-10-07 13:28:09,519 INFO org.apache.hadoop.hbase.master.RegionManager: in safe mode
> > > > 2009-10-07 13:28:11,542 INFO org.apache.hadoop.hbase.master.RegionManager: in safe mode
> > > > 2009-10-07 13:28:13,543 INFO org.apache.hadoop.hbase.master.RegionManager: in safe mode
> > > > 2009-10-07 13:28:15,545 INFO org.apache.hadoop.hbase.master.RegionManager: in safe mode
> > > > 2009-10-07 13:28:17,548 INFO org.apache.hadoop.hbase.master.RegionManager: in safe mode
> > > > 2009-10-07 13:28:19,555 INFO org.apache.hadoop.hbase.master.RegionManager: in safe mode
> > > > 2009-10-07 13:28:27,834 INFO org.apache.hadoop.hbase.master.BaseScanner: All 0 .META. region(s) scanned
> > > > 2009-10-07 13:29:27,832 INFO org.apache.hadoop.hbase.master.BaseScanner: All 0 .META. region(s) scanned
> > > > 2009-10-07 13:29:37,593 INFO org.apache.hadoop.hbase.master.RegionManager: in safe mode
> > > > 2009-10-07 13:30:27,834 INFO org.apache.hadoop.hbase.master.BaseScanner: All 0 .META. region(s) scanned
> > > > 2009-10-07 13:31:27,836 INFO org.apache.hadoop.hbase.master.BaseScanner: All 0 .META. region(s) scanned
> > > > 2009-10-07 13:32:27,838 INFO org.apache.hadoop.hbase.master.BaseScanner: All 0 .META. region(s) scanned
> > > > 2009-10-07 13:33:27,840 INFO org.apache.hadoop.hbase.master.BaseScanner: All 0 .META. region(s) scanned
> > > >
> > > >
> > > > Ananth T Sarathy
> > > >
> > > >
> > > > On Wed, Oct 7, 2009 at 1:20 PM, stack <stack@duboce.net> wrote:
> > > >
> > > > > > That's interesting to hear.  Keep us posted.
> > > > > >
> > > > > > HBase asks the filesystem if it's in safe mode and, if it is, it
> > > > > > parks itself.  Here is the code from the master:
> > > > >
> > > > >    if (this.fs instanceof DistributedFileSystem) {
> > > > >      // Make sure dfs is not in safe mode
> > > > >      String message = "Waiting for dfs to exit safe mode...";
> > > > >      while (((DistributedFileSystem) fs).setSafeMode(
> > > > >          FSConstants.SafeModeAction.SAFEMODE_GET)) {
> > > > >        LOG.info(message);
> > > > >        try {
> > > > >          Thread.sleep(this.threadWakeFrequency);
> > > > >        } catch (InterruptedException e) {
> > > > >          //continue
> > > > >        }
> > > > >      }
> > > > >    }
> > > > >
> > > > >
> > > > > > Then there is HBase's notion of safe mode.  The master stays in
> > > > > > safe mode until it does its initial scan of the catalog tables.
> > > > > > It keeps a flag in ZooKeeper while it's in safe mode so that
> > > > > > regionservers are aware of the state:
> > > > >
> > > > >  public boolean inSafeMode() {
> > > > >    if (safeMode) {
> > > > > >      if(isInitialMetaScanComplete() && regionsInTransition.size() == 0 &&
> > > > > >         tellZooKeeperOutOfSafeMode()) {
> > > > >        master.connection.unsetRootRegionLocation();
> > > > >        safeMode = false;
> > > > >        LOG.info("exiting safe mode");
> > > > >      } else {
> > > > >        LOG.info("in safe mode");
> > > > >      }
> > > > >    }
> > > > >    return safeMode;
> > > > >  }
> > > > >
> > > > > > Have you seen .META. and -ROOT- deploy to regionservers?  Have you
> > > > > > seen these regions being scanned in the master log?  (Enable DEBUG
> > > > > > if not already enabled.)
> > > > >
> > > > > Yours,
> > > > > ST.Ack
> > > > >
> > > > >
> > > > > On Wed, Oct 7, 2009 at 10:06 AM, Ananth T. Sarathy <
> > > > > ananth.t.sarathy@gmail.com> wrote:
> > > > >
> > > > > > > We have been running HBase on an S3 filesystem. It's the HBase
> > > > > > > regionserver, not HDFS, since we are using S3.  We haven't felt
> > > > > > > like it's been too slow, though the amount of data we are
> > > > > > > pushing isn't large enough to notice yet.
> > > > > > Ananth T Sarathy
> > > > > >
> > > > > >
> > > > > > > On Wed, Oct 7, 2009 at 12:47 PM, stack <stack@duboce.net> wrote:
> > > > > >
> > > > > > > > HBase or HDFS is in safe mode.  My guess is that it's the
> > > > > > > > latter.  Can you figure out from the HDFS logs why it won't
> > > > > > > > leave safe mode?  Usually under-replication or the loss of a
> > > > > > > > large swath of the cluster will flip on the safe-mode switch.
> > > > > > >
> > > > > > > > Are you trying to run HBase on an S3 filesystem?  An HBasista
> > > > > > > > tried it in the past and, FYI, found it insufferably slow.
> > > > > > > > Let us know how it goes for you.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > St.Ack
> > > > > > >
> > > > > > > On Wed, Oct 7, 2009 at 9:33 AM, Ananth T. Sarathy <
> > > > > > > ananth.t.sarathy@gmail.com> wrote:
> > > > > > >
> > > > > > > > > My regionserver has been stuck in safemode. What can I do to
> > > > > > > > > get it out of safemode?
> > > > > > > >
> > > > > > > > Ananth T Sarathy
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
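
For anyone following the thread: the exit condition in the quoted
inSafeMode() can be modelled in isolation. What follows is a minimal
stand-alone sketch, not HBase code. The class SafeModeSketch and all of its
methods are hypothetical names made up for illustration, and the ZooKeeper
step (tellZooKeeperOutOfSafeMode() in the quoted snippet) is replaced by a
plain boolean field. It only shows that the flag stays set for as long as the
initial catalog scan has not completed or any region is still in transition,
which is one possible reading of the repeated "in safe mode" lines in the log
above.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the master's region bookkeeping; NOT HBase code.
class SafeModeSketch {
    private boolean safeMode = true;                 // the master starts in safe mode
    private boolean initialMetaScanComplete = false; // assumed to be set by a catalog scanner
    private boolean zooKeeperUpdateOk = true;        // assumption: stands in for the ZooKeeper call
    private final Set<String> regionsInTransition = new HashSet<>();

    public void markInitialMetaScanComplete() { initialMetaScanComplete = true; }
    public void addRegionInTransition(String name) { regionsInTransition.add(name); }
    public void removeRegionInTransition(String name) { regionsInTransition.remove(name); }

    // Same shape as the quoted method: every condition must hold before
    // the flag is cleared; otherwise the caller keeps seeing "in safe mode".
    public boolean inSafeMode() {
        if (safeMode) {
            if (initialMetaScanComplete && regionsInTransition.isEmpty() && zooKeeperUpdateOk) {
                safeMode = false;
            }
        }
        return safeMode;
    }

    public static void main(String[] args) {
        SafeModeSketch master = new SafeModeSketch();
        System.out.println(master.inSafeMode()); // true: initial scan not complete

        master.markInitialMetaScanComplete();
        master.addRegionInTransition("example-region"); // hypothetical region name
        System.out.println(master.inSafeMode()); // true: a region is still in transition

        master.removeRegionInTransition("example-region");
        System.out.println(master.inSafeMode()); // false: all conditions hold
    }
}
```

In this model a master whose initial scan never counts as complete would
report safe mode indefinitely, so the scanner output is a reasonable place to
look first.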
