hbase-user mailing list archives

From Zheng Lv <lvzheng19800...@gmail.com>
Subject Re: Cannot open filename Exceptions
Date Tue, 16 Mar 2010 07:17:22 GMT
Hello Stack,
  I have uploaded some parts of the logs from the master, regionserver208 and
regionserver210 to:
  http://rapidshare.com/files/363988384/master_207_log.txt.html
  http://rapidshare.com/files/363988673/regionserver_208_log.txt.html
  http://rapidshare.com/files/363988819/regionserver_210_log.txt.html
  I noticed that there are some LeaseExpiredExceptions and "2010-03-15
16:06:32,864 ERROR org.apache.hadoop.hbase.regionserver.CompactSplitThread:
Compaction/Split failed for region ..." messages before 17 o'clock. Did these
lead to the error? Why did they happen? How can we avoid them?
  Thanks.
  LvZheng
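
For a first pass at those questions, a minimal sketch of pulling the suspect
entries out of the logs (the log paths and the 17:00 cutoff here are
assumptions based on this thread, not confirmed cluster paths):

    # Lease expirations across the regionserver logs (adjust paths as needed):
    grep -n LeaseExpiredException logs/hbase-*-regionserver-*.log

    # Compaction/split failures in the hour before the job failed:
    grep "2010-03-15 16:" logs/hbase-*-regionserver-*.log | grep CompactSplitThread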
2010/3/16 Stack <stack@duboce.net>

> Maybe just the master log from around this time would be sufficient to
> figure out the story.
> St.Ack
>
> On Mon, Mar 15, 2010 at 10:04 PM, Stack <stack@duboce.net> wrote:
> > Hey Zheng:
> >
> > On Mon, Mar 15, 2010 at 8:16 PM, Zheng Lv <lvzheng19800619@gmail.com>
> wrote:
> >> Hello Stack,
> >>  After we got these exceptions, we restarted the cluster and reran the
> >> job that had failed, and the job succeeded.
> >>  Now when we access
> >> /hbase/summary/1491233486/metrics/5046821377427277894, we get
> >> "Cannot access /hbase/summary/1491233486/metrics/5046821377427277894:
> >> No such file or directory.".
> >>
> >
> > So, that would seem to indicate that the reference was in memory
> > only... that file was not in the filesystem.  You could have tried
> > closing that region.  It would also have been interesting to find the
> > history of that region, to try to figure out how it came to hold in
> > memory a reference to a file that had since been removed.
> >
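
A minimal sketch of what "closing that region" could look like from the HBase
shell (close_region exists in 0.20-era shells, but check 'help' in your
version; REGIONNAME stands for the full region name quoted in the exception
further down):

    ./bin/hbase shell
    hbase> close_region 'REGIONNAME'

Closing and reopening the region should rebuild its in-memory list of store
files from what is actually in HDFS, dropping a stale reference like this one.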
> >>  The messages about this file in namenode logs are in here:
> >> http://rapidshare.com/files/363938595/log.txt.html
> >
> > This is interesting.  Do you have regionserver logs from 209, 208, and
> > 210 for corresponding times?
> >
> > Thanks,
> > St.Ack
> >
> >>  The job that failed started at about 17 o'clock.
> >>  By the way, the Hadoop version we are using is 0.20.1, and the HBase
> >> version we are using is 0.20.3.
> >>
> >>  Regards,
> >>  LvZheng
> >> 2010/3/16 Stack <stack@duboce.net>
> >>
> >>> Can you get that file from hdfs?
> >>>
> >>> > ./bin/hadoop fs -get /hbase/summary/1491233486/metrics/5046821377427277894
> >>>
> >>> Does it look wholesome?  Is it empty?
> >>>
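
One way to answer the "wholesome or empty" question with only standard fs
commands (the local destination path is arbitrary):

    # Does the namenode still list the file, and with what length?
    ./bin/hadoop fs -ls /hbase/summary/1491233486/metrics/

    # Pull a local copy and check its size; a zero-length copy would be telling.
    ./bin/hadoop fs -get /hbase/summary/1491233486/metrics/5046821377427277894 /tmp/
    ls -l /tmp/5046821377427277894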
> >>> What if you trace the life of that file in the regionserver logs or,
> >>> probably better, over in the namenode log?  If you move this file
> >>> aside, does the region deploy?
> >>>
> >>> St.Ack
> >>>
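
Sketches of both suggestions (the log path assumes default Hadoop log naming;
the destination of the move is arbitrary, and an fs -mv within HDFS keeps the
file recoverable):

    # Trace the life of the file in the namenode log:
    grep 5046821377427277894 logs/hadoop-*-namenode-*.log

    # Move the file aside within HDFS, then see whether the region deploys:
    ./bin/hadoop fs -mv /hbase/summary/1491233486/metrics/5046821377427277894 \
        /tmp/5046821377427277894.moved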
> >>> On Mon, Mar 15, 2010 at 3:40 AM, Zheng Lv <lvzheng19800619@gmail.com>
> >>> wrote:
> >>> > Hello Everyone,
> >>> >    Recently we often got these in our client logs:
> >>> >    org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
> >>> > contact region server 172.16.1.208:60020 for region
> >>> > summary,SITE_0000000032\x01pt\x0120100314000000\x01\x25E7\x258C\x25AE\x25E5\x258E\x25BF\x25E5\x2586\x2580\x25E9\x25B9\x25B0\x25E6\x2591\x25A9\x25E6\x2593\x25A6\x25E6\x259D\x2590\x25E6\x2596\x2599\x25E5\x258E\x2582\x2B\x25E6\x25B1\x25BD\x25E8\x25BD\x25A6\x25E9\x2585\x258D\x25E4\x25BB\x25B6\x25EF\x25BC\x258C\x25E5\x2598\x2580\x25E9\x2593\x2583\x25E9\x2593\x2583--\x25E7\x259C\x259F\x25E5\x25AE\x259E\x25E5\x25AE\x2589\x25E5\x2585\x25A8\x25E7\x259A\x2584\x25E7\x2594\x25B5\x25E8\x25AF\x259D\x25E3\x2580\x2581\x25E7\x25BD\x2591\x25E7\x25BB\x259C\x25E4\x25BA\x2592\x25E5\x258A\x25A8\x25E4\x25BA\x25A4\x25E5\x258F\x258B\x25E7\x25A4\x25BE\x25E5\x258C\x25BA\x25EF\x25BC\x2581,1268640385017,
> >>> > row 'SITE_0000000032\x01pt\x0120100315000000\x01\x2521\x25EF\x25BC\x2581\x25E9\x2594\x2580\x25E5\x2594\x25AE\x252F\x25E6\x2594\x25B6\x25E8\x25B4\x25AD\x25EF\x25BC\x2581VM700T\x2BVM700T\x2B\x25E5\x259B\x25BE\x25E5\x2583\x258F\x25E4\x25BF\x25A1\x25E5\x258F\x25B7\x25E4\x25BA\x25A7\x25E7\x2594\x259F\x25E5\x2599\x25A8\x2B\x25E7\x2594\x25B5\x25E5\x25AD\x2590\x25E6\x25B5\x258B\x25E9\x2587\x258F\x25E4\x25BB\x25AA\x25E5\x2599\x25A8\x25EF\x25BC\x258C\x25E5\x2598\x2580\x25E9\x2593\x2583\x25E9\x2593\x2583--\x25E7\x259C\x259F\x25E5\x25AE\x259E\x25E5\x25AE\x2589\x25E5\x2585\x25A8\x25E7\x259A\x2584\x25E7\x2594\x25B5\x25E8\x25AF\x259D\x25E3\x2580\x2581\x25E7\x25BD\x2591\x25E7\x25BB\x259C\x25E4\x25BA\x2592\x25E5\x258A\x25A8\x25E4\x25BA\x25A4\x25E5\x258F\x258B\x25E7\x25A4\x25BE\x25E5\x258C\x25BA\x25EF\x25BC\x2581',
> >>> > but failed after 10 attempts.
> >>> > Exceptions:
> >>> > java.io.IOException: java.io.IOException: Cannot open filename
> >>> > /hbase/summary/1491233486/metrics/5046821377427277894
> >>> >   at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1474)
> >>> >   at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1800)
> >>> >   at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1616)
> >>> >   at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1743)
> >>> >   at java.io.DataInputStream.read(DataInputStream.java:132)
> >>> >   at org.apache.hadoop.hbase.io.hfile.BoundedRangeFileInputStream.read(BoundedRangeFileInputStream.java:99)
> >>> >   at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:100)
> >>> >   at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:1020)
> >>> >   at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:971)
> >>> >   at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.loadBlock(HFile.java:1304)
> >>> >   at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.seekTo(HFile.java:1186)
> >>> >   at org.apache.hadoop.hbase.io.HalfHFileReader$1.seekTo(HalfHFileReader.java:207)
> >>> >   at org.apache.hadoop.hbase.regionserver.StoreFileGetScan.getStoreFile(StoreFileGetScan.java:80)
> >>> >   at org.apache.hadoop.hbase.regionserver.StoreFileGetScan.get(StoreFileGetScan.java:65)
> >>> >   at org.apache.hadoop.hbase.regionserver.Store.get(Store.java:1461)
> >>> >   at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:2396)
> >>> >   at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:2385)
> >>> >   at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1731)
> >>> >   at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
> >>> >   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>> >   at java.lang.reflect.Method.invoke(Method.java:597)
> >>> >   at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657)
> >>> >   at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> >>> >    Is there any way to fix this problem? Or is there anything we can
> >>> > do, even manually, to relieve it?
> >>> >    Any suggestions?
> >>> >    Thank you.
> >>> >    LvZheng
> >>> >
> >>>
> >>
> >
>
