hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: RegionServer Aborting
Date Fri, 26 Mar 2010 17:18:29 GMT
Your region server log doesn't show the reason for the abort, but if
the error below was in the DN log at the time, the RS most likely
aborted because it couldn't write to HDFS. HBase has no insight into
why it can't reach a DN, so it takes the paranoid route and shuts
itself down.
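
To confirm that the DN log really contains that error, something like
the following could be used. This is only a sketch in Python: the
function name find_write_timeouts is made up for illustration, and the
pattern is taken from the exception text quoted in this thread, so it
may need adjusting for other Hadoop versions.

```python
import re

# Matches the DataNode write-timeout message quoted in this thread.
# The capture group is the timeout value in milliseconds.
WRITE_TIMEOUT = re.compile(
    r"java\.net\.SocketTimeoutException: (\d+) millis timeout while waiting "
    r"for channel to be ready for write"
)

def find_write_timeouts(log_text):
    """Return the timeout (ms) of every write-timeout error in log_text."""
    # Collapse line breaks so a message wrapped across lines still matches.
    flat = " ".join(log_text.split())
    return [int(ms) for ms in WRITE_TIMEOUT.findall(flat)]
```

Calling it on the contents of a DN log file returns one entry per
occurrence; an empty list means the error was not found and the abort
probably had another cause.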

If you search the mailing lists for that error, you will probably
stumble upon the following configuration:

<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>0</value>
</property>

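This is set in hdfs-site.xml. A value of 0 disables the DN socket
write timeout (the 480000 ms in your trace is the 8-minute default).
I personally use this config and I have never seen that problem on my
clusters since.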
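
To double-check what a given hdfs-site.xml actually sets, a small
Python sketch like this one could help. The helper name get_property
is made up for illustration, and it only reads the XML text it is
given; it does not know about Hadoop's built-in defaults or other
config files.

```python
import xml.etree.ElementTree as ET

def get_property(xml_text, name):
    """Return the <value> of the named <property>, or None if it is not set."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None
```

For example, get_property(text, "dfs.datanode.socket.write.timeout")
returns "0" when the property above is present in the file contents
passed as text, and None when the file leaves it unset.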

Hope this helps,

J-D

2010/3/26  <y_823910@tsmc.com>:
> HDFS log
>
> java.net.SocketTimeoutException: 480000 millis timeout while waiting for
> channel to be ready for write. ch :
> java.nio.channels.SocketChannel[connected local=/10.81.47.50:50010
> remote=/10.81.47.35:34325]
>      at
> org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>      at
> org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
>      at
> org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
>      at
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:313)
>      at
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:400)
>      at
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:180)
>      at
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
>      at java.lang.Thread.run(Thread.java:619)
>
> 2010-03-26 15:53:30,910 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(10.81.47.50:50010,
> storageID=DS-758373957-10.81.47.50-50010-1264018078483, infoPort=50075,
> ipcPort=50020):DataXceiver
> java.net.SocketTimeoutException: 480000 millis timeout while waiting for
> channel to be ready for write. ch :
> java.nio.channels.SocketChannel[connected local=/10.81.47.50:50010
> remote=/10.81.47.35:34325]
>      at
> org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>      at
> org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
>      at
> org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
>      at
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:313)
>      at
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:400)
>      at
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:180)
>      at
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
>      at java.lang.Thread.run(Thread.java:619)
>
>
>
>
> Fleming Chiu(邱宏明)
> 707-6128
> y_823910@tsmc.com
> 週一無肉日吃素救地球(Meat Free Monday Taiwan)
>
> From:    y_823910@tsmc.com
> Date:    2010/03/26 05:06 PM
> To:      hbase-user@hadoop.apache.org
> cc:      (bcc: Y_823910/TSMC)
> Subject: RegionServer Aborting
> Please respond to hbase-user
>
> Hi,
>
> I didn't send any command to shut down my region server,
> so I don't know why it shut down on its own.
> Any ideas?
>
>
> HBase version : 0.20.2, r834515
>
> Hadoop version:  0.20.1, r810220
>
> 2010-03-26 15:56:59,330 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: aborting server at:
> 10.81.47.50:60020
> 2010-03-26 15:57:01,797 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: worker thread exiting
> 2010-03-26 15:57:01,797 INFO org.apache.zookeeper.ZooKeeper: Closing
> session: 0x1279807e42c0003
> 2010-03-26 15:57:01,797 INFO org.apache.zookeeper.ClientCnxn: Closing
> ClientCnxn for session: 0x1279807e42c0003
> 2010-03-26 15:57:01,800 INFO org.apache.zookeeper.ClientCnxn: Exception
> while closing send thread for session 0x1279807e42c0003 : Read error rc =
> -1 java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
> 2010-03-26 15:57:01,915 INFO org.apache.zookeeper.ClientCnxn: Disconnecting
> ClientCnxn for session: 0x1279807e42c0003
> 2010-03-26 15:57:01,915 INFO org.apache.zookeeper.ZooKeeper: Session:
> 0x1279807e42c0003 closed
> 2010-03-26 15:57:01,915 DEBUG
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Closed connection with
> ZooKeeper
> 2010-03-26 15:57:01,915 INFO org.apache.zookeeper.ClientCnxn: EventThread
> shut down
> 2010-03-26 15:57:02,024 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> regionserver/10.81.47.50:60020 exiting
> 2010-03-26 15:57:06,669 INFO org.apache.hadoop.hbase.Leases:
> regionserver/10.81.47.50:60020.leaseChecker closing leases
> 2010-03-26 15:57:06,669 INFO org.apache.hadoop.hbase.Leases:
> regionserver/10.81.47.50:60020.leaseChecker closed leases
> 2010-03-26 15:57:06,670 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Starting shutdown
> thread.
> 2010-03-26 15:57:06,670 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Shutdown thread
> complete
>
>
>
> Fleming Chiu(邱宏明)
> 707-6128
> y_823910@tsmc.com
> 週一無肉日吃素救地球(Meat Free Monday Taiwan)
>
>
>
> ---------------------------------------------------------------------------
>                                                         TSMC PROPERTY
>  This email communication (and any attachments) is proprietary information
>  for the sole use of its intended recipient. Any unauthorized review, use
>  or distribution by anyone other than the intended recipient is strictly
>  prohibited.  If you are not the intended recipient, please notify the
>  sender by replying to this email, and then delete this email and any
>  copies of it immediately. Thank you.
> ---------------------------------------------------------------------------
