hadoop-common-user mailing list archives

From Raghu Angadi <rang...@yahoo-inc.com>
Subject Re: LeaseExpiredException and too many xceiver
Date Fri, 31 Oct 2008 22:28:49 GMT

The config on most Y! clusters sets dfs.datanode.max.xcievers to a large 
value, something like 1k to 2k. You could try that.
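
For concreteness, a minimal sketch of what that override might look like 
in hadoop-site.xml on a cluster of this era (the 2048 here is just an 
example value from the suggested 1k-2k range; datanodes need a restart 
to pick up the change):

   <property>
     <name>dfs.datanode.max.xcievers</name>
     <!-- default is 256; note the property's historical "xcievers" spelling -->
     <value>2048</value>
   </property>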

Raghu.

Nathan Marz wrote:
> Looks like the exception on the datanode got truncated a little bit. 
> Here's the full exception:
> 
> 2008-10-31 14:20:09,978 ERROR org.apache.hadoop.dfs.DataNode: 
> DatanodeRegistration(10.100.11.115:50010,
> storageID=DS-2129547091-10.100.11.115-50010-1225485937590, 
> infoPort=50075, ipcPort=50020):DataXceiver: java.io.IOException:
> xceiverCount 257 exceeds the limit of concurrent xcievers 256
>         at 
> org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:1030)
>         at java.lang.Thread.run(Thread.java:619)
> 
> 
> On Oct 31, 2008, at 2:49 PM, Nathan Marz wrote:
> 
>> Hello,
>>
>> We are seeing some really bad errors on our hadoop cluster. After 
>> reformatting the whole cluster, the first job we run immediately fails 
>> with "Could not find block locations..." errrors. In the namenode 
>> logs, we see a ton of errors like:
>>
>> 2008-10-31 14:20:44,799 INFO org.apache.hadoop.ipc.Server: IPC Server 
>> handler 5 on 7276, call addBlock(/tmp/dustintmp/shredded_dataunits/_t$
>> org.apache.hadoop.dfs.LeaseExpiredException: No lease on 
>> /tmp/dustintmp/shredded_dataunits/_temporary/_attempt_200810311418_0002_m_000023_0$
>>        at 
>> org.apache.hadoop.dfs.FSNamesystem.checkLease(FSNamesystem.java:1166)
>>        at 
>> org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1097)
>>        at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
>>        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>>        at 
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
>>        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
>>
>>
>>
>> In the datanode logs, we see a ton of errors like:
>>
>> 2008-10-31 14:20:09,978 ERROR org.apache.hadoop.dfs.DataNode: 
>> DatanodeRegistration(10.100.11.115:50010, 
>> storageID=DS-2129547091-10.100.11.1$
>> of concurrent xcievers 256
>>        at 
>> org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:1030)
>>        at java.lang.Thread.run(Thread.java:619)
>>
>>
>>
>> Anyone have any ideas on what may be wrong?
>>
>> Thanks,
>> Nathan Marz
>> Rapleaf
> 

