hbase-user mailing list archives

From: Lucas Nazário dos Santos <nazario.lu...@gmail.com>
Subject: Re: HBase hangs
Date: Fri, 03 Jul 2009 01:38:04 GMT
Thanks Andy! Doing this right now.

I'll test and let you know the outcome.

Lucas




On Thu, Jul 2, 2009 at 8:01 PM, Andrew Purtell <apurtell@apache.org> wrote:

> Yes. This:
>
> storageID=DS-1037027782$
> java.io.FileNotFoundException:
> /usr/local/hadoop_data/hadoop-root/dfs/data/current/subdir13/blk_684021770465535076
> (Too many open files)
>
>
> means you need to increase the file descriptor limit for the user under
> which you run your DataNode processes. For example, one common method is to
> set 'nofile' limits in /etc/security/limits.conf to a larger multiple of 2,
> perhaps 10240. Both hard and soft limits need to be set for the setting to
> take effect. Mine is:
>
>   hadoop soft nofile 10240
>   hadoop hard nofile 10240
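>
>   (As a quick sanity check, and assuming the DataNode runs as the 'hadoop'
>   user, a fresh login as that user followed by
>
>     ulimit -n
>
>   should now report 10240. Processes started before the change keep the old
>   limit until they are restarted.)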
>
>   - Andy
>
>
>
> ________________________________
> From: Lucas Nazário dos Santos <nazario.lucas@gmail.com>
> To: hbase-user@hadoop.apache.org
> Sent: Thursday, July 2, 2009 10:25:57 AM
> Subject: Re: HBase hangs
>
> Thanks for all the help, Andy.
>
> I've had a look at the other log files and found some strange messages.
> Hadoop's datanode log produced the errors below around the time HBase
> became unavailable.
>
> Does it help?
>
> I'll have a look at your suggestions and give them a shot.
>
> Thanks,
> Lucas
>
>
>
> NODE 192.168.1.2
>
> 2009-07-02 05:52:34,999 WARN org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(192.168.1.2:50010, storageID=DS-395520527-$
> java.net.SocketTimeoutException: 480000 millis timeout while waiting for
> channel to be ready for write. ch : java.nio.channels.SocketChannel[$
>        at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:185)
>        at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
>        at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
>        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:313)
>        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:400)
>        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:180)
>        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
>        at java.lang.Thread.run(Thread.java:619)
>
> 2009-07-02 05:52:34,999 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(192.168.1.2:50010, storageID=DS-395520527$
> java.net.SocketTimeoutException: 480000 millis timeout while waiting for
> channel to be ready for write. ch : java.nio.channels.SocketChannel[$
>        at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:185)
>        at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
>        at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
>        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:313)
>        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:400)
>        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:180)
>        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
>        at java.lang.Thread.run(Thread.java:619)
>
>
>
> NODE 192.168.1.3
>
> 2009-07-02 04:27:06,643 WARN org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(192.168.1.3:50010, storageID=DS-1037027782$
> java.io.FileNotFoundException:
> /usr/local/hadoop_data/hadoop-root/dfs/data/current/subdir13/blk_684021770465535076
> (Too many open files)
>        at java.io.RandomAccessFile.open(Native Method)
>        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
>        at org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockInputStream(FSDataset.java:738)
>        at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:166)
>        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:172)
>        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
>        at java.lang.Thread.run(Thread.java:619)
>
> 2009-07-02 04:27:06,644 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(192.168.1.3:50010, storageID=DS-103702778$
> java.io.FileNotFoundException:
> /usr/local/hadoop_data/hadoop-root/dfs/data/current/subdir13/blk_684021770465535076
> (Too many open files)
>        at java.io.RandomAccessFile.open(Native Method)
>        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
>        at org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockInputStream(FSDataset.java:738)
>        at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:166)
>        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:172)
>        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
>        at java.lang.Thread.run(Thread.java:619)
>
>
>
>
>
> On Thu, Jul 2, 2009 at 1:06 PM, Andrew Purtell <apurtell@apache.org> wrote:
>
> > Hi,
> >
> > Are there related exceptions in your DataNode logs?
> >
> > There are some HDFS related troubleshooting steps up on the wiki:
> >    http://wiki.apache.org/hadoop/Hbase/Troubleshooting
> >
> > Have you increased the number of file descriptors available to the Data
> > Nodes? For example, one common method is to set 'nofile' limits in
> > /etc/security/limits.conf to a larger multiple of 2, perhaps 10240.
> >
> > Have you added a setting of dfs.datanode.max.xcievers (in hadoop-site.xml)
> > to a larger value than the default (256)? For example, 1024 or 2048?
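> >
> > (As a sketch, the hadoop-site.xml entry would look something like the
> > following; the value 2048 is only an example, and the DataNodes need a
> > restart to pick it up.)
> >
> >   <property>
> >     <name>dfs.datanode.max.xcievers</name>
> >     <value>2048</value>
> >   </property>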
> >
> >   - Andy
> >
> >
> >
> > On Thu, Jul 2, 2009 at 11:40 AM, Lucas Nazário dos Santos wrote:
> >
> > > Hi,
> > >
> > > This is the second time it has happened. I have a Hadoop job that reads
> > > and inserts data into HBase. It works perfectly for a couple of hours
> > > and then HBase hangs.
> > >
> > > I'm using HBase 0.19.3 and Hadoop 0.19.1.
> > >
> > > Interestingly, the list command shows the table, but a count returns
> > > an exception.
> > >
> > > Below is the error log.
> > >
> > > Does anybody know what is happening?
> > >
> > > Lucas
> > >
> > >
> > >
> > > 2009-07-02 05:38:02,337 INFO
> > > org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_FLUSH:
> > > document,,1246496132379: safeMode=false
> > > 2009-07-02 05:38:02,337 INFO
> > > org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
> > > MSG_REGION_FLUSH: document,,1246496132379: safeMode=false
> > > 2009-07-02 05:40:11,518 WARN org.apache.hadoop.hdfs.DFSClient: Exception
> > > while reading from blk_4097294633794140351_1008 of
> > > /hbase/-ROOT-/70236052/info/mapfiles/6567566389528605238/index from
> > > 192.168.1.3:50010: java.io.IOException: Premeture EOF from inputStream
> > >     at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:102)
> > >     at org.apache.hadoop.hdfs.DFSClient$BlockReader.readChunk(DFSClient.java:1207)
> > >     at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:238)
> > >     at org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:177)
> > >     at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:194)
> > >     at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:159)
> > >     at org.apache.hadoop.hdfs.DFSClient$BlockReader.read(DFSClient.java:1060)
> > >     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.readBuffer(DFSClient.java:1615)
> > >     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1665)
> > >     at java.io.DataInputStream.readFully(DataInputStream.java:178)
> > >     at java.io.DataInputStream.readFully(DataInputStream.java:152)
> > >     at org.apache.hadoop.hbase.io.SequenceFile$Reader.init(SequenceFile.java:1464)
> > >     at org.apache.hadoop.hbase.io.SequenceFile$Reader.<init>(SequenceFile.java:1442)
> > >     at org.apache.hadoop.hbase.io.SequenceFile$Reader.<init>(SequenceFile.java:1431)
> > >     at org.apache.hadoop.hbase.io.SequenceFile$Reader.<init>(SequenceFile.java:1426)
> > >     at org.apache.hadoop.hbase.io.MapFile$Reader.open(MapFile.java:318)
> > >     at org.apache.hadoop.hbase.io.HBaseMapFile$HBaseReader.<init>(HBaseMapFile.java:78)
> > >     at org.apache.hadoop.hbase.io.BloomFilterMapFile$Reader.<init>(BloomFilterMapFile.java:68)
> > >     at org.apache.hadoop.hbase.regionserver.HStoreFile.getReader(HStoreFile.java:443)
> > >     at org.apache.hadoop.hbase.regionserver.StoreFileScanner.openReaders(StoreFileScanner.java:127)
> > >     at org.apache.hadoop.hbase.regionserver.StoreFileScanner.<init>(StoreFileScanner.java:65)
> > >     at org.apache.hadoop.hbase.regionserver.HStoreScanner.<init>(HStoreScanner.java:92)
> > >     at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:2134)
> > >     at org.apache.hadoop.hbase.regionserver.HRegion$HScanner.<init>(HRegion.java:2000)
> > >     at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1187)
> > >     at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1714)
> > >     at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
> > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > >     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
> > >     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:912)
> > >
> > > 2009-07-02 05:41:11,517 WARN org.apache.hadoop.hdfs.DFSClient: Exception
> > > while reading from blk_4097294633794140351_1008 of
> > > /hbase/-ROOT-/70236052/info/mapfiles/6567566389528605238/index from
> > > 192.168.1.3:50010: java.io.IOException: Premeture EOF from inputStream
> > >     at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:102)
> > >     at org.apache.hadoop.hdfs.DFSClient$BlockReader.readChunk(DFSClient.java:1207)
> > >     at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:238)
> > >     at org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:177)
> > >     at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:194)
> > >     at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:159)
> > >     at org.apache.hadoop.hdfs.DFSClient$BlockReader.read(DFSClient.java:1060)
> > >     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.readBuffer(DFSClient.java:1615)
> > >     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1665)
> > >     at java.io.DataInputStream.readFully(DataInputStream.java:178)
> > >     at java.io.DataInputStream.readFully(DataInputStream.java:152)
> > >     at org.apache.hadoop.hbase.io.SequenceFile$Reader.init(SequenceFile.java:1464)
> > >     at org.apache.hadoop.hbase.io.SequenceFile$Reader.<init>(SequenceFile.java:1442)
> > >     at org.apache.hadoop.hbase.io.SequenceFile$Reader.<init>(SequenceFile.java:1431)
> > >     at org.apache.hadoop.hbase.io.SequenceFile$Reader.<init>(SequenceFile.java:1426)
> > >     at org.apache.hadoop.hbase.io.MapFile$Reader.open(MapFile.java:318)
> > >     at org.apache.hadoop.hbase.io.HBaseMapFile$HBaseReader.<init>(HBaseMapFile.java:78)
> > >     at org.apache.hadoop.hbase.io.BloomFilterMapFile$Reader.<init>(BloomFilterMapFile.java:68)
> > >     at org.apache.hadoop.hbase.regionserver.HStoreFile.getReader(HStoreFile.java:443)
> > >     at org.apache.hadoop.hbase.regionserver.StoreFileScanner.openReaders(StoreFileScanner.java:127)
> > >     at org.apache.hadoop.hbase.regionserver.StoreFileScanner.<init>(StoreFileScanner.java:65)
> > >     at org.apache.hadoop.hbase.regionserver.HStoreScanner.<init>(HStoreScanner.java:92)
> > >     at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:2134)
> > >     at org.apache.hadoop.hbase.regionserver.HRegion$HScanner.<init>(HRegion.java:2000)
> > >     at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1187)
> > >     at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1714)
> > >     at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
> > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > >     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
> > >     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:912)
> > >
> > >
> >
> >
> >
> >
> >
>
>
>
>
>
