hadoop-hdfs-dev mailing list archives

From "Chris Machemer (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-10502) Enabled memory locking and now HDFS won't start up
Date Wed, 08 Jun 2016 11:45:21 GMT
Chris Machemer created HDFS-10502:

             Summary: Enabled memory locking and now HDFS won't start up
                 Key: HDFS-10502
                 URL: https://issues.apache.org/jira/browse/HDFS-10502
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: fs
    Affects Versions: 2.7.2
         Environment: RHEL 6.8
            Reporter: Chris Machemer

My goal is to speed up reads. I have about 500k small files (2 KB to 15 KB each) and I'm trying
to use HDFS as a cache for serialized instances of Java objects.
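
For a rough sense of scale, the figures above can be turned into a total size. The following
is a minimal Python sketch, not part of the original report. It assumes "2k to 15k" means
2 KiB to 15 KiB per file, and that cached data is accounted in whole OS pages of 4096 bytes;
both are assumptions, and estimate_footprint is a hypothetical helper name.

```python
def round_up(size: int, page_size: int) -> int:
    """Round size up to the next multiple of page_size."""
    return -(-size // page_size) * page_size


def estimate_footprint(n_files: int, file_size: int, page_size: int = 4096):
    """Return (raw_bytes, page_rounded_bytes) for n_files files of file_size bytes.

    page_size=4096 is an assumed OS page size, not a value from the report.
    """
    raw = n_files * file_size
    rounded = n_files * round_up(file_size, page_size)
    return raw, rounded


if __name__ == "__main__":
    n_files = 500_000  # "about 500k small files"
    for size_kib in (2, 15):  # assumed lower and upper file sizes in KiB
        raw, rounded = estimate_footprint(n_files, size_kib * 1024)
        print(
            f"{size_kib} KiB files: raw={raw / 1024**3:.2f} GiB, "
            f"page-rounded={rounded / 1024**3:.2f} GiB"
        )
```

Under these assumptions the raw data comes to roughly 1 GB if every file is at the small end
and well over 7 GB if every file is at the large end, which is worth comparing against
whatever locked-memory limit is configured.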

I've written the code to construct and serialize all the objects out to HDFS, and am now hoping
to improve read performance, because accessing the objects from disk-based storage is proving
too slow for my application's SLAs.

So my first question is: is using memory locking with hdfs cacheadmin pools and directives
the right way to cache my objects in memory, or should I create RAM disks and use
memory-based storage instead?

If hdfs cacheadmin is the way to go (it's the path I'm going down so far), then I need to
figure out whether what's happening is a bug or a misconfiguration on my part. When I start
HDFS with a gig of memory locked (set both in limits.d for ulimit -l and in hdfs-site.xml),
the server starts up and presumably tries to cache things into memory, and then I get hours
and hours of timeouts in the logs like this:

2016-06-08 07:42:50,856 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: IOException in offerService
java.net.SocketTimeoutException: Call From stgb-fe1.litle.com/ to localhost:8020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/ remote=localhost/]; For more details see:  http://wiki.apache.org/hadoop/SocketTimeout
	at sun.reflect.GeneratedConstructorAccessor8.newInstance(Unknown Source)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
	at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:751)
	at org.apache.hadoop.ipc.Client.call(Client.java:1479)
	at org.apache.hadoop.ipc.Client.call(Client.java:1412)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
	at com.sun.proxy.$Proxy13.sendHeartbeat(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClientSideTranslatorPB.java:153)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:554)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:653)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:824)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/
	at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
	at java.io.FilterInputStream.read(FilterInputStream.java:133)
	at java.io.FilterInputStream.read(FilterInputStream.java:133)
	at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:520)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
	at java.io.DataInputStream.readInt(DataInputStream.java:387)
	at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1084)
	at org.apache.hadoop.ipc.Client$Connection.run(Client.java:979)
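
The report sets the locked-memory limit in two places, and the two use different units:
ulimit -l in bash reports kilobytes, while dfs.datanode.max.locked.memory in hdfs-site.xml
is given in bytes. The following is a minimal Python sketch of a consistency check between
them, not part of the original report. The helper name fits_memlock and the sample values
are hypothetical, and a mismatch here is only one possible misconfiguration; it is not
established as the cause of the timeouts shown above.

```python
def fits_memlock(configured_bytes: int, ulimit_l: str) -> bool:
    """Check whether a configured locked-memory size fits within `ulimit -l`.

    configured_bytes: value of dfs.datanode.max.locked.memory, in bytes.
    ulimit_l: the text printed by `ulimit -l` in bash, either "unlimited"
              or a number of kilobytes (units of 1024 bytes).
    """
    value = ulimit_l.strip()
    if value == "unlimited":
        return True
    limit_bytes = int(value) * 1024  # convert kilobytes to bytes
    return configured_bytes <= limit_bytes


if __name__ == "__main__":
    one_gib = 1024**3  # hypothetical reading of "a gig" as 1 GiB
    for sample in ("64", "1000000", "1048576", "unlimited"):
        print(f"ulimit -l = {sample}: fits = {fits_memlock(one_gib, sample)}")
```

The sample value 1000000 is included because it looks like "a gig" but is slightly smaller
than 1 GiB once converted to bytes, so a 1 GiB setting would not fit under it.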
