hadoop-user mailing list archives

From Vinod Kumar Vavilapalli <vino...@hortonworks.com>
Subject Re: DFS respond very slow
Date Tue, 16 Oct 2012 00:22:34 GMT
Try picking a single operation, say "hadoop dfs -ls", and profile it step by step:
 - Measure how long the client JVM takes to start. Enable debug logging on the client side by
exporting HADOOP_ROOT_LOGGER=DEBUG,console
 - Measure the time between the client starting and the namenode audit log showing the read
request. Enable debug logging on the daemons as well.
 - You can also wget the namenode web pages and see how fast they return.
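
The first and third checks can be scripted so the numbers are comparable across runs and
across servers. What follows is only a rough sketch, not part of any Hadoop tooling: the
helper names time_command and time_fetch are made up for illustration, and the demo at the
bottom times a trivial child process so it runs anywhere.

```python
import subprocess
import sys
import time
import urllib.request


def time_command(argv):
    """Run argv to completion and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    result = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode, time.perf_counter() - start


def time_fetch(url, timeout=10.0):
    """Download url fully and return (http_status, elapsed_seconds)."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
        status = response.status
    return status, time.perf_counter() - start


if __name__ == "__main__":
    # Harmless demo: time a child process that does nothing.
    print("trivial process:", time_command([sys.executable, "-c", "pass"]))
```

On a cluster node you would call time_command(["hadoop", "dfs", "-ls", "/"]) a few times and
compare it with time_fetch("http://5.6.7.11:50070/"), taking the address from the
dfs.http.address value quoted below. If the web page returns quickly while the CLI call stays
at several seconds, the client side is the more likely place to look.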

To restate what is already obvious: the slowness is most likely related to your network setup
and/or configuration.
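
On the configuration side, one quick check is to compare the posted hdfs-site.xml values
across the three servers. The sketch below does that for a few of the properties, with the
values copied from the excerpts quoted further down. The function name differing_properties
is hypothetical, and a difference it reports is only a lead to follow up on, not a confirmed
cause.

```python
from collections import defaultdict

# Values copied from the hdfs-site.xml excerpts quoted below in this thread.
POSTED = {
    "server 1": {
        "dfs.datanode.address": "0.0.0.0:50010",
        "dfs.http.address": "5.6.7.11:50070",
        "dfs.https.address": "5.6.7.11:50470",
        "dfs.secondary.http.address": "5.6.7.12:50090",
    },
    "server 2": {
        "dfs.datanode.address": "0.0.0.0:50010",
        "dfs.http.address": "5.6.7.11:50070",
        "dfs.https.address": "5.6.7.11:50470",
        "dfs.secondary.http.address": "5.6.7.12:50090",
    },
    "server 3": {
        "dfs.datanode.address": "0.0.0.0:50010",
        "dfs.http.address": "127.0.0.1:50070",
        "dfs.https.address": "127.0.0.1:50470",
        "dfs.secondary.http.address": "5.6.7.12:50090",
    },
}


def differing_properties(configs):
    """Return {property: {server: value}} for every property that is
    missing on some server or does not have the same value everywhere."""
    by_property = defaultdict(dict)
    for server, props in configs.items():
        for name, value in props.items():
            by_property[name][server] = value
    return {
        name: values
        for name, values in by_property.items()
        if len(values) < len(configs) or len(set(values.values())) > 1
    }


if __name__ == "__main__":
    for name, values in sorted(differing_properties(POSTED).items()):
        print(name, values)
```

With the posted values this reports dfs.http.address and dfs.https.address, which point at
127.0.0.1 on server 3 and at 5.6.7.11 on the other two. Whether that matters on a
datanode-only host is not established in this thread.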

Thanks,
+Vinod

On Oct 10, 2012, at 12:20 AM, Alexey wrote:

> ok, here you go:
> I have 3 servers:
> datanode on server 1, 2, 3
> namenode on server 1
> secondarynamenode on server 2
> 
> all servers are at the hetzner datacenter and connected through 100Mbit
> link, pings between them about 0.1ms
> 
> each server has 24Gb ram and intel core i7 3Ghz CPU
> disk is 700Gb RAID
> 
> the bindings related configuration is the following:
> server 1:
> core-site.xml
> --------------------------------------
> <name>fs.default.name</name>
> <value>hdfs://5.6.7.11:8020</value>
> --------------------------------------
> 
> hdfs-site.xml
> --------------------------------------
> <name>dfs.datanode.address</name>
> <value>0.0.0.0:50010</value>
> 
> <name>dfs.datanode.http.address</name>
> <value>0.0.0.0:50075</value>
> 
> <name>dfs.http.address</name>
> <value>5.6.7.11:50070</value>
> 
> <name>dfs.secondary.https.port</name>
> <value>50490</value>
> 
> <name>dfs.https.port</name>
> <value>50470</value>
> 
> <name>dfs.https.address</name>
> <value>5.6.7.11:50470</value>
> 
> <name>dfs.secondary.http.address</name>
> <value>5.6.7.12:50090</value>
> --------------------------------------
> 
> server 2:
> core-site.xml
> --------------------------------------
> <name>fs.default.name</name>
> <value>hdfs://5.6.7.11:8020</value>
> --------------------------------------
> 
> hdfs-site.xml
> --------------------------------------
> <name>dfs.datanode.address</name>
> <value>0.0.0.0:50010</value>
> 
> <name>dfs.datanode.http.address</name>
> <value>0.0.0.0:50075</value>
> 
> <name>dfs.http.address</name>
> <value>5.6.7.11:50070</value>
> 
> <name>dfs.secondary.https.port</name>
> <value>50490</value>
> 
> <name>dfs.https.port</name>
> <value>50470</value>
> 
> <name>dfs.https.address</name>
> <value>5.6.7.11:50470</value>
> 
> <name>dfs.secondary.http.address</name>
> <value>5.6.7.12:50090</value>
> --------------------------------------
> 
> server 3:
> core-site.xml
> --------------------------------------
> <name>fs.default.name</name>
> <value>hdfs://5.6.7.11:8020</value>
> --------------------------------------
> 
> hdfs-site.xml
> --------------------------------------
> <name>dfs.datanode.address</name>
> <value>0.0.0.0:50010</value>
> 
> <name>dfs.datanode.http.address</name>
> <value>0.0.0.0:50075</value>
> 
> <name>dfs.http.address</name>
> <value>127.0.0.1:50070</value>
> 
> <name>dfs.secondary.https.port</name>
> <value>50490</value>
> 
> <name>dfs.https.port</name>
> <value>50470</value>
> 
> <name>dfs.https.address</name>
> <value>127.0.0.1:50470</value>
> 
> <name>dfs.secondary.http.address</name>
> <value>5.6.7.12:50090</value>
> --------------------------------------
> 
> netstat output:
> server 1
>> tcp        0      0 5.6.7.11:8020           0.0.0.0:*               LISTEN      10870/java
>> tcp        0      0 5.6.7.11:50070          0.0.0.0:*               LISTEN      10870/java
>> tcp        0      0 0.0.0.0:50010           0.0.0.0:*               LISTEN      10997/java
>> tcp        0      0 0.0.0.0:50075           0.0.0.0:*               LISTEN      10997/java
>> tcp        0      0 0.0.0.0:50020           0.0.0.0:*               LISTEN      10997/java
> 
> server 2
>> tcp        0      0 0.0.0.0:50010           0.0.0.0:*               LISTEN      23683/java
>> tcp        0      0 0.0.0.0:50075           0.0.0.0:*               LISTEN      23683/java
>> tcp        0      0 0.0.0.0:50020           0.0.0.0:*               LISTEN      23683/java
>> tcp        0      0 5.6.7.12:50090          0.0.0.0:*               LISTEN      23778/java
> 
> server 3
>> tcp        0      0 0.0.0.0:50010           0.0.0.0:*               LISTEN      894/java
>> tcp        0      0 0.0.0.0:50075           0.0.0.0:*               LISTEN      894/java
>> tcp        0      0 0.0.0.0:50020           0.0.0.0:*               LISTEN      894/java
> 
> if I'm transferring big files between servers I'm getting about 9Mb/s
> and even 10Mb/s with rsync
> 
> On 10/09/12 11:56 PM, Harsh J wrote:
>> Hi,
>> 
>> OK, can you detail your network infrastructure used here, and also
>> make sure your daemons are binding to the right interfaces as well
>> (use netstat to check perhaps)? What rate of transfer do you get for
>> simple file transfers (ftp, scp, etc.)?
>> 
>> On Wed, Oct 10, 2012 at 12:24 PM, Alexey <alexxoid@gmail.com> wrote:
>>> Hello Harsh,
>>> 
>>> I noticed these issues from the start.
>>> Yes, I mean dfs.balance.bandwidthPerSec property, I set this property to
>>> 5000000.
>>> 
>>> On 10/09/12 11:50 PM, Harsh J wrote:
>>>> Hey Alexey,
>>>> 
>>>> Have you noticed this right from the start itself? Also, what exactly
>>>> do you mean by "Limited replication bandwidth between datanodes -
>>>> 5Mb." - Are you talking of dfs.balance.bandwidthPerSec property?
>>>> 
>>>> On Wed, Oct 10, 2012 at 10:53 AM, Alexey <alexxoid@gmail.com> wrote:
>>>>> Additional info: I also tried to use openjdk instead of sun's - issue
>>>>> still persists
>>>>> 
>>>>> On 10/09/12 03:12 AM, Alexey wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> I have an issue with hadoop dfs. I have 3 servers (24Gb RAM on each).
>>>>>> The servers are not overloaded, they just have hadoop installed. One
>>>>>> has a datanode and namenode, the second a datanode only, the third a
>>>>>> datanode and secondarynamenode.
>>>>>> 
>>>>>> Hadoop datanodes have a max memory limit of 8Gb. Default replication
>>>>>> factor - 2. Limited replication bandwidth between datanodes - 5Mb.
>>>>>> 
>>>>>> I've set up hadoop to communicate between nodes by IP address.
>>>>>> Everything works - I can read/write files on each datanode, etc. But
>>>>>> the issue is that hadoop dfs commands execute very slowly; even
>>>>>> "hadoop dfs -ls /" takes about 3 seconds to execute, though it has only
>>>>>> one folder, /user, in it.
>>>>>> Files also upload to hdfs very slowly - hundreds of kilobytes/second.
>>>>>> 
>>>>>> I'm using Debian stable x86-64 distribution and hadoop running through
>>>>>> sun-java6-jdk 6.26-0squeeze1
>>>>>> 
>>>>>> Please give me any suggestions on what I need to adjust or check to
>>>>>> resolve this issue.
>>>>>> 
>>>>>> As I said before - overall hdfs configuration is correct, because
>>>>>> everything works except performance.
>>>>>> 
>>>>>> --
>>>>>> Best regards
>>>>>> Alexey
>>>>>> 
>>>>> 
>>>>> --
>>>>> Best regards
>>>>> Alexey
>>>> 
>>>> 
>>>> 
>>> 
>>> --
>>> Best regards
>>> Alexey
>> 
>> 
>> 
> 
> -- 
> Best regards
> Alexey

