hadoop-hdfs-user mailing list archives

From Chris Nauroth <cnaur...@hortonworks.com>
Subject Re: Why do non data nodes need rack awareness?
Date Fri, 03 Jun 2016 20:35:25 GMT
Hello Colin,

Judging from the stack trace, I think you've hit a known HDFS bug:
HDFS-8055.  A fix for this bug has been committed for the upcoming Apache
Hadoop 2.8.0 release.


--Chris Nauroth

On 6/3/16, 1:21 PM, "Colin Kincaid Williams" <discord@uw.edu> wrote:

>Thanks for your insight Vinay:
>It makes sense using it now, I appreciate the ability to select which
>rack or round-robin. However I think the client api behavior might
>have changed, because our first rack awareness script from early
>hadoop 2.0.0 didn't provide a default ip, but I don't recall these
>With respect to my current issue: We had noticed that we could not
>hdfs dfs -cat any files from our namenode itself. But we had made our
>rack awareness script log its caller's arguments with echo $1 >>
>/tmp/foo. I didn't find the IP for the namenode, or the loopback
>interface, in that file, so the script didn't appear to be asked for
>rack information for the namenode. However, after adding the default
>rack to the script, the issue went away. If the rack awareness script
>never wrote the namenode IP into the file, why did we see the following
>behavior from the namenode itself?
>sudo -u hdfs hdfs dfs -cat
>cat: java.lang.NullPointerException
>On Fri, Jun 3, 2016 at 1:14 AM, Vinayakumar B <vinayakumarb@apache.org> wrote:
>> The rack awareness feature was introduced to distribute data blocks
>> across multiple racks, to avoid data loss if a whole rack fails.
>> When reading or writing data blocks, the datanode closest to the
>> client is preferred. To find the nearest datanode in terms of rack
>> mapping, the client's rack details are required. That is why the
>> namenode resolves the client's rack mapping even when the client is
>> not a datanode. If the correct rack is given, a datanode on the local
>> rack will be chosen for reads, improving performance.
>> If the default rack is given for a non-datanode IP, a random datanode
>> will be chosen to read the data.
>> Hope this helps,
>> Cheers,
>> -Vinay
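
A rough illustration of the point above, not HDFS's actual code: if each
node is written as a /rack/host path, the distance between two nodes can
be counted as the number of hops to their closest common ancestor. The
function names and the example paths are hypothetical.

```python
def distance(a, b):
    """Hops between two /rack/host paths via their closest common ancestor."""
    pa = a.strip("/").split("/")
    pb = b.strip("/").split("/")
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    return (len(pa) - common) + (len(pb) - common)


def sort_by_distance(client, replicas):
    """Order replica locations nearest-first relative to the client."""
    return sorted(replicas, key=lambda r: distance(client, r))


# A client on rack2 prefers the rack2 replica (distance 2 versus 4).
print(sort_by_distance("/rack2/gw", ["/rack1/dn1", "/rack2/dn3"]))
# A client on the default rack is equally far from every datanode.
print(distance("/default-rack/gw", "/rack1/dn1"))
```

Under this model a client mapped to the default rack sees every datanode
at the same distance, which matches the "any random datanode" remark.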
>> On 3 Jun 2016 03:37, "Colin Kincaid Williams" <discord@uw.edu> wrote:
>> Recently we had a namenode that had a failed edits directory, and
>> there was a failover. Things appeared to be functioning properly at
>> first, but later we had hdfs issues.
>> Looking at the namenode logs, we saw
>> 2016-06-01 20:38:18,771 ERROR
>> org.apache.hadoop.net.ScriptBasedMapping: Script
>> /etc/hadoop/conf/getRackID.sh returned 0 values when 1 were expected.
>> 2016-06-01 20:38:18,771 WARN org.apache.hadoop.ipc.Server: IPC Server
>> handler 0 on 8020, call
>> org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from
>> Call#484441029 Retry#0
>> java.lang.NullPointerException
>>   at
>>   at
>>   at
>>   at
>>   at
>>   at
>>   at
>>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
>>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>>   at java.security.AccessController.doPrivileged(Native Method)
>>   at javax.security.auth.Subject.doAs(Subject.java:415)
>>   at
>>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
>> So we could see that our rack awareness script was not returning a
>> value. Then we changed the script to record the arguments it was
>> called with. We found a list of IPs, some of which run services like
>> oozie, and some of which belong to our gateway server. However, none
>> of these IPs are the datanodes themselves.
>> The symptoms of this issue were that the namenode itself couldn't cat
>> files on the system, or make requests to move files on the history
>> server, etc.
>> From my understanding of rack awareness, we just need to provide a
>> rack id for hosts that are datanodes. However, all our datanodes were
>> listed, and the requested IPs were from non-datanodes.
>> The solution was to provide a default rack for missing IPs in the rack
>> awareness script. This is not clear from the rack awareness docs, and
>> it caused a DOS on our hadoop services.
>> But I want to know why the rack awareness script is getting called
>> with IPs of non-datanodes from our hadoop namenode. Is this a design
>> feature of the yarn libraries? Why do non-datanode IPs need a rack
>> id?
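
For readers hitting the same error, the workaround described above
(always answering with a rack, even for unknown hosts) might look roughly
like the following. This is a minimal sketch, not the original
getRackID.sh: the host table, the rack names, and the resolve helper are
made up for illustration. It assumes the script may be called with
several hosts at once and must print one rack per argument, which is
what the "returned 0 values when 1 were expected" log line suggests.

```python
#!/usr/bin/env python3
"""Hypothetical topology script: prints one rack per argument."""
import sys

# Assumed host-to-rack table; these addresses and rack names are invented.
RACKS = {
    "10.0.1.11": "/rack1",
    "10.0.1.12": "/rack1",
    "10.0.2.21": "/rack2",
}

# Fallback for gateways, the namenode, and any other host not listed above.
DEFAULT_RACK = "/default-rack"


def resolve(hosts):
    """Return exactly one rack per host, falling back to DEFAULT_RACK."""
    return [RACKS.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Answer every argument so the output count matches the input count.
    print(" ".join(resolve(sys.argv[1:])))
```

The important property is that the number of racks printed always equals
the number of arguments, regardless of whether a host is a datanode.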
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@hadoop.apache.org
>> For additional commands, e-mail: user-help@hadoop.apache.org