hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Problem With Zookeeper
Date Wed, 13 Jan 2010 20:34:07 GMT
Oh I see something, it seems that the master is waiting on the file
system in the main thread. Is HDFS running? Can you create a file?

J-D

On Wed, Jan 13, 2010 at 12:27 PM, Ananth T. Sarathy
<ananth.t.sarathy@gmail.com> wrote:
> Here's what I get
>
> http://pastebin.com/m60c1864b
>
>
> Ananth T Sarathy
>
>
> On Wed, Jan 13, 2010 at 2:57 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>
>> Do "jps" then "jstack pid" with the master's pid given by jps.
>>
>> J-D
>>
>> On Wed, Jan 13, 2010 at 11:41 AM, Ananth T. Sarathy
>> <ananth.t.sarathy@gmail.com> wrote:
>> > Well, when I do a ps -ef|grep hbase I have 3 processes running. I have killed
>> > them all, reinstalled HBase, formatted my name node, and still the master.log
>> > is the same when I restart. What could be causing it to hang?
>> >
>> >
>> > Ananth T Sarathy
>> >
>> >
>> > On Wed, Jan 13, 2010 at 2:26 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>> >
>> >> Well it's just weird that your master would just "refuse" to start. Is
>> >> the process still there? If you jstack it, is there any thread
>> >> running?
>> >>
>> >> You could also clean up everything and retry, but that's just the easy
>> >> way out :P
>> >>
>> >> J-D
>> >>
>> >> On Wed, Jan 13, 2010 at 11:23 AM, Ananth T. Sarathy
>> >> <ananth.t.sarathy@gmail.com> wrote:
>> >> > master.out is empty... Could something have gotten kludged up from the previous
>> >> > issues? Are there files I should delete, or should I reformat my namenode?
>> >> >
>> >> > I don't have any data in these yet, so I can afford to blow things away, but
>> >> > I cleaned out the tmp dir already, so I am not sure what else I need to do.
>> >> > Ananth T Sarathy
>> >> >
>> >> >
>> >> > On Wed, Jan 13, 2010 at 2:14 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>> >> >
>> >> >> If that's everything from your master log, then I would suggest you
>> >> >> take a look at the .out file (instead of .log) since it might be a
>> >> >> problem on startup.
>> >> >>
>> >> >> J-D
>> >> >>
>> >> >> On Wed, Jan 13, 2010 at 11:09 AM, Ananth T. Sarathy
>> >> >> <ananth.t.sarathy@gmail.com> wrote:
>> >> >> > Master log
>> >> >> >
>> >> >> > http://pastebin.com/m469d1b39
>> >> >> >
>> >> >> > Zookeeper log
>> >> >> > http://pastebin.com/m47f0503
>> >> >> >
>> >> >> > region server
>> >> >> >
>> >> >> > http://pastebin.com/m305fab14
>> >> >> >
>> >> >> > Ananth T Sarathy
>> >> >> >
>> >> >> >
>> >> >> > On Wed, Jan 13, 2010 at 2:02 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>> >> >> >
>> >> >> >> Looks like your master didn't register itself in ZooKeeper; you should
>> >> >> >> look in its log.
>> >> >> >>
>> >> >> >> J-D
>> >> >> >>
>> >> >> >> On Wed, Jan 13, 2010 at 10:59 AM, Ananth T. Sarathy
>> >> >> >> <ananth.t.sarathy@gmail.com> wrote:
>> >> >> >> > OK, we got that to work and ZooKeeper is coming up, but now I am getting
>> >> >> >> > something else... the regionservers aren't connecting because of
>> >> >> >> >
>> >> >> >> > 2010-01-13 13:57:56,029 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to read master address from ZooKeeper. Retrying. Error was:
>> >> >> >> > java.io.IOException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/master
>> >> >> >> >        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.readAddressOrThrow(ZooKeeperWrapper.java:332)
>> >> >> >> >        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.readMasterAddressOrThrow(ZooKeeperWrapper.java:240)
>> >> >> >> >        at org.apache.hadoop.hbase.regionserver.HRegionServer.getMaster(HRegionServer.java:1339)
>> >> >> >> >        at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:1371)
>> >> >> >> >        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:427)
>> >> >> >> >        at java.lang.Thread.run(Thread.java:636)
>> >> >> >> > Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/master
>> >> >> >> >        at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
>> >> >> >> >        at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>> >> >> >> >        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:892)
>> >> >> >> >        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.readAddressOrThrow(ZooKeeperWrapper.java:328)
>> >> >> >> >        ... 5 more
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > any ideas?
>> >> >> >> > Ananth T Sarathy
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > On Wed, Jan 13, 2010 at 12:52 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>> >> >> >> >
>> >> >> >> >> HBase 0.20.2 and earlier checked only one address against the list
>> >> >> >> >> that is provided: the default one that Java knew of. It seems that
>> >> >> >> >> in your case your /etc/hosts makes this machine resolve itself only
>> >> >> >> >> as localhost. You can:
>> >> >> >> >>
>> >> >> >> >> 1) Try to fix your network configuration to have your machine always
>> >> >> >> >> resolve by its hostname first, or
>> >> >> >> >>
>> >> >> >> >> 2) Use HBase 0.20.3RC1 which contains a fix that tries harder to match
>> >> >> >> >> the address. You can get it here:
>> >> >> >> >> http://people.apache.org/~jdcryans/hbase-0.20.3-candidate-1/
>> >> >> >> >>
>> >> >> >> >> Sorry for that,
>> >> >> >> >>
>> >> >> >> >> J-D
>> >> >> >> >>
>> >> >> >> >> On Wed, Jan 13, 2010 at 9:43 AM, Ananth T. Sarathy
>> >> >> >> >> <ananth.t.sarathy@gmail.com> wrote:
>> >> >> >> >> > I have hbase-env.sh set to manage ZooKeeper. When I try to start HBase, the
>> >> >> >> >> > zookeeper out says
>> >> >> >> >> >
>> >> >> >> >> > java.io.IOException: Could not find my address: localhost in list of ZooKeeper quorum servers
>> >> >> >> >> >        at org.apache.hadoop.hbase.zookeeper.HQuorumPeer.writeMyID(HQuorumPeer.java:128)
>> >> >> >> >> >        at org.apache.hadoop.hbase.zookeeper.HQuorumPeer.main(HQuorumPeer.java:67)
>> >> >> >> >> > ~
>> >> >> >> >> >
>> >> >> >> >> > in my hbase-site.xml
>> >> >> >> >> >
>> >> >> >> >> >  <property>
>> >> >> >> >> >   <name>hbase.zookeeper.quorum</name>
>> >> >> >> >> >   <value>gs2,gs3,gs4</value>
>> >> >> >> >> >   <description>Comma separated list of servers in the ZooKeeper Quorum.
>> >> >> >> >> >   For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
>> >> >> >> >> >   By default this is set to localhost for local and pseudo-distributed modes
>> >> >> >> >> >   of operation. For a fully-distributed setup, this should be set to a full
>> >> >> >> >> >   list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
>> >> >> >> >> >   this is the list of servers which we will start/stop ZooKeeper on.
>> >> >> >> >> >   </description>
>> >> >> >> >> >  </property>
>> >> >> >> >> >
>> >> >> >> >> > in my /etc/hosts
>> >> >> >> >> >
>> >> >> >> >> > # hostname gs2 added to /etc/hosts by anaconda
>> >> >> >> >> > 127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4 gs2
>> >> >> >> >> > ::1         localhost localhost.localdomain localhost6 localhost6.localdomain6 gs2
>> >> >> >> >> >
>> >> >> >> >> > 192.168.20.101 gs1
>> >> >> >> >> > 192.168.20.102 gs2
>> >> >> >> >> > 192.168.20.103 gs3
>> >> >> >> >> > 192.168.20.104 gs4
>> >> >> >> >> > 192.168.20.105 gs5
>> >> >> >> >> > 192.168.20.106 gs6
>> >> >> >> >> > 192.168.20.107 gs7
>> >> >> >> >> > 192.168.20.108 gs8
>> >> >> >> >> > 192.168.20.110 gs10
>> >> >> >> >> > 192.168.20.111 gs11
>> >> >> >> >> > 192.168.20.112 gs12
>> >> >> >> >> > 192.168.20.113 gs13
>> >> >> >> >> > 192.168.20.114 gs14
>> >> >> >> >> > 192.168.20.115 gs15
>> >> >> >> >> > 192.168.20.116 gs16
>> >> >> >> >> > 192.168.20.117 gs17
>> >> >> >> >> >
>> >> >> >> >> > Am I missing something here? Why does it insist on localhost in the
>> >> >> >> >> > quorum list? What do I need to do to unconfuse it?
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> > Ananth T Sarathy
>> >> >> >> >> >
>> >> >> >> >>
>> >> >> >> >
>> >> >> >>
>> >> >> >
>> >> >>
>> >> >
>> >>
>> >
>>
>
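
The /etc/hosts quoted in the original question lists gs2 on both loopback lines (127.0.0.1 and ::1) as well as on 192.168.20.102, which is consistent with J-D's diagnosis that the machine resolves itself only as localhost. What follows is a minimal sketch, assuming Python 3 is available, of one way to spot quorum hosts that also appear on a loopback line. The function names are hypothetical and are not part of HBase or ZooKeeper, and the sample text is only an excerpt of the hosts file from the thread.

```python
import ipaddress


def loopback_aliases(hosts_text):
    """Return the set of names that a hosts file maps to a loopback address."""
    names = set()
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # first field is not an IP address; skip the line
        if addr.is_loopback:
            names.update(fields[1:])
    return names


def conflicting_names(hosts_text, quorum):
    """Return the quorum members that also appear on a loopback line."""
    return sorted(set(quorum) & loopback_aliases(hosts_text))


# Excerpt of the /etc/hosts from the thread (only the quorum hosts are kept).
HOSTS = """\
# hostname gs2 added to /etc/hosts by anaconda
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4 gs2
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6 gs2

192.168.20.102 gs2
192.168.20.103 gs3
192.168.20.104 gs4
"""

if __name__ == "__main__":
    # Quorum taken from hbase.zookeeper.quorum in the thread: gs2,gs3,gs4
    print(conflicting_names(HOSTS, ["gs2", "gs3", "gs4"]))
```

Any name this reports is a candidate for removal from the loopback lines, which is one way to apply option 1 from J-D's reply. Whether that alone fixes the startup error depends on the rest of the network configuration.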
