hbase-user mailing list archives

From stack <st...@duboce.net>
Subject Re: is there any problem with our environment?
Date Tue, 13 Oct 2009 00:22:56 GMT
Thanks for posting.  It's much easier reading the logs from there.

Looking in nohup.out I see it can't find region
'webpage,http:\x2F\x2Fnews.163.com\x2F09\x2F080\x2F0\x2F5FOO155J0001124J.html1255072992000_751685,1255316061169'.
It never finds it.  It looks like it was assigned successfully to
192.168.33.5 going by the master log.  Once you've figured out the
hardware/networking issues, let's work at getting that region back online.

The master timed out its session against zk because of 'no route to host'.
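
To line those 'no route to host' errors up against the NIC trouble, it may
help to pull the link-down events out of the dmesg output mentioned in the
quoted message below. The following is only a minimal sketch: it assumes
kernel log lines shaped like the one quoted ("[seconds] driver: iface: link
down"); the function name and the second sample line are made up for
illustration.

```python
import re

# Matches kernel log lines such as "[3641697.122769] r8169: eth0: link down".
# The exact line shape is an assumption based on the single example quoted.
LINK_DOWN = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(\S+):\s+(\S+):\s+link down$")


def link_down_events(lines):
    """Return (seconds_since_boot, driver, interface) for each link-down line."""
    events = []
    for line in lines:
        match = LINK_DOWN.match(line.strip())
        if match:
            events.append((float(match.group(1)), match.group(2), match.group(3)))
    return events


# The first line is taken from the thread; the second is a hypothetical
# "link up" line included only to show that non-matching lines are skipped.
sample = [
    "[3641697.122769] r8169: eth0: link down",
    "[3641699.501122] r8169: eth0: link up",
]
print(link_down_events(sample))
```

The timestamps are seconds since boot, so they would still need converting to
wall-clock time before comparing them with the timestamps in the master log.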

St.Ack

On Mon, Oct 12, 2009 at 12:23 AM, Zheng Lv <lvzheng19800619@gmail.com> wrote:

> Hello Stack,
>    I have enabled DEBUG and restarted the test program. This time the
> master shut down, and I have put the logs on skydrive.
>
> http://cid-a331bb289a14fbef.skydrive.live.com/browse.aspx/.Public?uc=2&isFromRichUpload=1
> .
>    "nohup.out" is our test program log, "hbase-cyd-master-ubuntu6.log" is
> master log.
>
>    On the other hand, today we found that when we run "dmesg", there are
> many lines like "[3641697.122769] r8169: eth0: link down". I think this
> might be the reason for so many "no route to host" and "Time Out" errors.
> Our system manager is checking now; if we have a result we will let you know. :)
>    Thanks,
>    LvZheng.
>
> 2009/10/11 stack <stack@duboce.net>
>
> > On Fri, Oct 9, 2009 at 3:18 AM, Zheng Lv <lvzheng19800619@gmail.com>
> > wrote:
> >
> > > ...
> > > so,
> > >    > please remove the delay so hbase fails faster so it doesn't take so
> > >    > long to figure the issue.
> > >    > Are you inserting every 10ms because hbase is falling over on you?  If
> > >    Yes I inserted every 10ms because I'm afraid hbase would fall over.  Now
> > > I have removed the delay.
> > >
> > >    After doing these, we have run the test program again, and one region
> > > server shut down after about 2 hours, another after about 3.
> > >    I will post the logs from these two servers in the following reply mails.
> > >
> > >
> > Thanks for doing the above.
> >
> > For future debugging, please enable DEBUG and put your logs somewhere
> > where I can pull them or put them up in pastebin.  Logs in email messages
> > are hard to follow.  Thanks.
> >
> >
> > >    > Ok.  So this is hbase 0.20.0?  Tell us about your hardware.  What kind
> > >    > is it?  CPU/RAM/Disks.
> > >    Yes we are using hbase 0.20.0. And the following is our hardware:
> > >
> > >    CPU:amd x3 710
> > >    RAM:8g ddr2 800
> > >    Disk:270g(raid0)
> > >
> > >
> > That's an interesting chip -- 3 cores!  The above should be fine as long as
> > you corral your mapreduce jobs running on the same cluster.
> >
> >
> >
> >
> > >    We have 7 servers with above hardware, one for master, three for
> > > namenodes / regionservers, and the other 3 for zks.
> > >    By the way, what kind of hardware and environment do you suggest we
> > > have?
> > >
> >
> >
> > This configuration seems fine to start with.  Later we might experiment
> > with running zk on the same machines as the regionservers, then up the
> > number of regionservers to 6 and the quorum members to 5.
> >
> > St.Ack
> >
> >
> > >
> > >    Thank you, very much.
> > >    LvZheng.
> > >
> >
>
