hbase-user mailing list archives

From: Jean-Daniel Cryans <jdcry...@apache.org>
Subject: Re: timing out for hdfs errors faster
Date: Thu, 07 Apr 2011 17:14:48 GMT
> Another question: why would the dfsclient setting for socket timeout
> (for data reading) be set so high by default if HBase is expected to
> be real time?  Shouldn't it be a few seconds (5?)

Not all clusters are used for real-time applications. Also, users
usually first try to cram in as much data as they can and see if it
holds, disregarding their hardware, whether they are swapping, or
anything else that might make things slow. A lot of configurations are
set to high values for those reasons.

>> 2011-04-07 07:49:41,527 WARN org.apache.hadoop.hdfs.DFSClient: Failed
>> to connect to /10.103.7.5:50010 for file
>> /hbase/media_data/1c95bfcf0dd19800b1f44278627259ae/att/7725092577730365184
>> for block 802538788372768807:java.net.SocketTimeoutException: 60000
>> millis timeout while waiting for channel to be ready for read. ch :
>> java.nio.channels.SocketChannel[connected local=/10.101.6.8:40801
>> remote=/10.103.7.5:50010
>>
>> What would be the configuration setting to shorten the timeout, say
>> to 5 seconds?  What about retries (if any)?

Something is up with that DataNode, as the region server isn't even
able to establish a channel to it. The retries are done against other
replicas (no need to hit the same faulty DataNode twice). Looking at
the code, the timeout for reads is set with dfs.socket.timeout.
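
For example, something like this in hbase-site.xml (or in
hdfs-site.xml on the region server's classpath) should bring the read
timeout down to 5 seconds, since the value is in milliseconds:

  <property>
    <name>dfs.socket.timeout</name>
    <value>5000</value>
  </property>

Keep in mind that a value that low will also make the client give up
quickly on DataNodes that are only momentarily slow, so test it
against your workload before relying on it.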

J-D
