hadoop-hdfs-user mailing list archives

From Manuel de Ferran <manuel.defer...@gmail.com>
Subject Re: Could not get additional block while writing hundreds of files
Date Thu, 04 Jul 2013 08:46:56 GMT
Hi Azuryy,

During the import, dfsadmin -report shows:

DFS Used%: 17.72%

Moreover, the import succeeds from time to time with the same data load. It
seems that the Datanodes look down from the Namenode's point of view, but why?
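
To see whether the failures cluster around particular moments (which would be
consistent with the Namenode briefly treating the Datanodes as unavailable),
the placement errors in the namenode log can be counted per minute. Below is a
minimal Python sketch, assuming the log holds one entry per line in the format
quoted in this thread. The names failures_per_minute and scan_log are
hypothetical helpers, not part of Hadoop.

```python
import re
from collections import Counter

# Assumed line layout, taken from the namenode log excerpt in this thread:
#   2013-07-03 15:03:16,427 ERROR <class>: ... could only be replicated to 0 nodes, instead of 1
# Only timestamped ERROR lines are counted, so the IPC Server INFO line and the
# stack-trace line that repeat the same message are not counted again.
ERROR_LINE = re.compile(
    r"^(?P<minute>\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2},\d{3} ERROR "
    r".*could only be replicated to (?P<got>\d+) nodes, instead of (?P<want>\d+)"
)


def failures_per_minute(lines):
    """Map 'YYYY-MM-DD HH:MM' to the number of failed block allocations."""
    counts = Counter()
    for line in lines:
        match = ERROR_LINE.match(line)
        if match:
            counts[match.group("minute")] += 1
    return counts


def scan_log(path):
    """Hypothetical helper: read a namenode log file and count failures per minute."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return failures_per_minute(handle)
```

Calling scan_log with the path of the namenode log and printing the sorted
items gives a rough per-minute histogram, which can then be compared with the
datanode logs for the same minutes.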



On Thu, Jul 4, 2013 at 3:31 AM, Azuryy Yu <azuryyyu@gmail.com> wrote:

> Hi Manuel,
>
> 2013-07-03 15:03:16,427 WARN
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place
> enough replicas, still in need of 3
> 2013-07-03 15:03:16,427 ERROR
> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
> as:root cause:java.io.IOException: File /log/1372863795616 could only be
> replicated to 0 nodes, instead of 1
>
>
> This indicates that you don't have enough space on HDFS. Can you check the
> cluster capacity used?
>
>
>
>
> On Thu, Jul 4, 2013 at 12:14 AM, Manuel de Ferran <
> manuel.deferran@gmail.com> wrote:
>
>> Greetings all,
>>
>> We are trying to import data into an HDFS cluster, but we hit random
>> exceptions. We are trying to figure out the root cause (misconfiguration,
>> too much load, ...) and how to solve it.
>>
>> The client writes hundreds of files with a replication factor of 3. It
>> crashes sometimes at the beginning, sometimes close to the end, and in rare
>> cases it succeeds.
>>
>> On failure, we have on client side:
>>  DataStreamer Exception: org.apache.hadoop.ipc.RemoteException:
>> java.io.IOException: File /log/1372863795616 could only be replicated to 0
>> nodes, instead of 1
>>         at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1558)
>>         at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:696)
>>          ....
>>
>> which seems to be a well-known error. We have followed the hints from the
>> Troubleshooting page, but we're still stuck: there is plenty of disk space
>> available on the datanodes, there are free inodes, we are far below the
>> open files limit, and all datanodes are up and running.
>>
>> Note that we have other HDFS clients that are still able to write files
>> while the import is running.
>>
>> Here is the corresponding extract of the namenode log file:
>>
>> 2013-07-03 15:03:15,951 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of
>> transactions: 46009 Total time for transactions(ms): 153Number of
>> transactions batched in Syncs: 5428 Number of syncs: 32889 SyncTimes(ms):
>> 139555
>> 2013-07-03 15:03:16,427 WARN
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place
>> enough replicas, still in need of 3
>> 2013-07-03 15:03:16,427 ERROR
>> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
>> as:root cause:java.io.IOException: File /log/1372863795616 could only be
>> replicated to 0 nodes, instead of 1
>> 2013-07-03 15:03:16,427 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 9 on 9002, call addBlock(/log/1372863795616, DFSClient_1875494617,
>> null) from 192.168.1.141:41376: error: java.io.IOException: File
>> /log/1372863795616 could only be replicated to 0 nodes, instead of 1
>> java.io.IOException: File /log/1372863795616 could only be replicated to
>> 0 nodes, instead of 1
>>         at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1558)
>>         at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:696)
>>         at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>>         at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>         at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>>
>>
>> During the process, fsck reports about 300 open files. The cluster is
>> running hadoop-1.0.3.
>>
>> Any advice about the configuration? We tried lowering
>> dfs.heartbeat.interval and raised dfs.datanode.max.xcievers to 4k.
>> Should we also raise dfs.datanode.handler.count?
>>
>>
>> Thanks for your help
>>
>
>
