hadoop-user mailing list archives

From Ted Dunning <tdunn...@maprtech.com>
Subject Re: DFS respond very slow
Date Tue, 16 Oct 2012 02:23:03 GMT
Uhhh... Alexey, did you really mean that you are running 100 megabit per
second network links?

That is going to make hadoop run *really* slowly.

Also, putting RAID under any DFS, be it Hadoop or MapR, is not a good recipe
for performance.  Not that it matters if you only have about 10 megabytes per
second available from the network.
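[Editor's note: a quick back-of-the-envelope check of what a 100 Mbit/s link can deliver, which is the ceiling Ted is pointing at:]

```shell
# Sanity check: theoretical ceiling of a 100 Mbit/s link, in MB/s.
# Real throughput will be somewhat lower due to TCP/IP framing overhead.
mbit=100
awk -v m="$mbit" 'BEGIN { printf "%d Mbit/s = %.1f MB/s max\n", m, m/8 }'
# prints: 100 Mbit/s = 12.5 MB/s max
```

So the 9-10 MB/s Alexey reports later in the thread is already close to line rate for this network.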

On Mon, Oct 15, 2012 at 6:56 PM, Andy Isaacson <adi@cloudera.com> wrote:

> Also, note that JVM startup overhead, etc, means your -ls time is not
> completely unreasonable. Using OpenJDK on a cluster of VMs, my "hdfs
> dfs -ls" takes 1.88 seconds according to time (and 1.59 seconds of
> user CPU time).
>
> I'd be much more concerned about your slow transfer times.  On the
> same cluster, I can easily push 4 MB/sec even with only a 100MB file
> using "hdfs dfs -put - foo.txt". And of course using distcp or
> multiple -put workloads HDFS can saturate multiple GigE links.
>
> -andy
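[Editor's note: one way to reproduce Andy's measurement is to time a put of a fixed-size file and divide. The sketch below times a local copy the same way; on a real cluster you would replace the `cp` with the `hdfs dfs -put` shown in the comment. The file paths are illustrative:]

```shell
# Time a 100 MB transfer and report MB/s. A local cp stands in here for
# "hdfs dfs -put /tmp/test100m.bin /test100m.bin" (illustrative paths),
# which is what you would time against a real cluster.
dd if=/dev/zero of=/tmp/test100m.bin bs=1M count=100 2>/dev/null
t0=$(date +%s.%N)
cp /tmp/test100m.bin /tmp/test100m.copy
t1=$(date +%s.%N)
rate=$(awk -v a="$t0" -v b="$t1" 'BEGIN { printf "%.1f", 100/(b-a) }')
echo "transferred 100 MB at ${rate} MB/s"
```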
>
> On Mon, Oct 15, 2012 at 5:22 PM, Vinod Kumar Vavilapalli
> <vinodkv@hortonworks.com> wrote:
> > Try picking up a single operation, say "hadoop dfs -ls", and start profiling.
> >  - Time the client JVM is taking to start. Enable debug logging on the
> > client side by exporting HADOOP_ROOT_LOGGER=DEBUG,CONSOLE
> >  - Time between the client starting and the namenode audit logs showing
> > the read request. Also enable debug logging on the daemons.
> >  - Also, you can wget the namenode web pages and see how fast they return.
> >
> > To repeat what is already obvious, it is most likely related to your
> > network setup and/or configuration.
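[Editor's note: the profiling steps above can be sketched as shell commands. The namenode address (5.6.7.11:50070) comes from the config quoted later in this thread; adjust for your cluster:]

```shell
# A sketch of the profiling steps Vinod describes. The namenode address
# (5.6.7.11:50070) comes from the config later in this thread.
export HADOOP_ROOT_LOGGER=DEBUG,console   # client-side debug logging
t0=$(date +%s.%N)
hadoop dfs -ls / >/dev/null 2>&1 || true  # the client operation to profile
t1=$(date +%s.%N)
awk -v a="$t0" -v b="$t1" 'BEGIN { printf "client op took %.2fs\n", b-a }'
# Then compare against the namenode web UI response time:
#   time wget -q -O /dev/null http://5.6.7.11:50070/
```

Comparing the total wall-clock time against the gap between client start and the namenode audit-log entry separates JVM startup cost from network/RPC cost.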
> >
> > Thanks,
> > +Vinod
> >
> > On Oct 10, 2012, at 12:20 AM, Alexey wrote:
> >
> > ok, here you go:
> > I have 3 servers:
> > datanode on server 1, 2, 3
> > namenode on server 1
> > secondarynamenode on server 2
> >
> > all servers are at the Hetzner datacenter and connected through 100 Mbit
> > links, pings between them are about 0.1 ms
> >
> > each server has 24 GB RAM and an Intel Core i7 3 GHz CPU
> > disk is a 700 GB RAID array
> >
> > the binding-related configuration is the following:
> > server 1:
> > core-site.xml
> > --------------------------------------
> > <name>fs.default.name</name>
> > <value>hdfs://5.6.7.11:8020</value>
> > --------------------------------------
> >
> > hdfs-site.xml
> > --------------------------------------
> > <name>dfs.datanode.address</name>
> > <value>0.0.0.0:50010</value>
> >
> > <name>dfs.datanode.http.address</name>
> > <value>0.0.0.0:50075</value>
> >
> > <name>dfs.http.address</name>
> > <value>5.6.7.11:50070</value>
> >
> > <name>dfs.secondary.https.port</name>
> > <value>50490</value>
> >
> > <name>dfs.https.port</name>
> > <value>50470</value>
> >
> > <name>dfs.https.address</name>
> > <value>5.6.7.11:50470</value>
> >
> > <name>dfs.secondary.http.address</name>
> > <value>5.6.7.12:50090</value>
> > --------------------------------------
> >
> > server 2:
> > core-site.xml
> > --------------------------------------
> > <name>fs.default.name</name>
> > <value>hdfs://5.6.7.11:8020</value>
> > --------------------------------------
> >
> > hdfs-site.xml
> > --------------------------------------
> > <name>dfs.datanode.address</name>
> > <value>0.0.0.0:50010</value>
> >
> > <name>dfs.datanode.http.address</name>
> > <value>0.0.0.0:50075</value>
> >
> > <name>dfs.http.address</name>
> > <value>5.6.7.11:50070</value>
> >
> > <name>dfs.secondary.https.port</name>
> > <value>50490</value>
> >
> > <name>dfs.https.port</name>
> > <value>50470</value>
> >
> > <name>dfs.https.address</name>
> > <value>5.6.7.11:50470</value>
> >
> > <name>dfs.secondary.http.address</name>
> > <value>5.6.7.12:50090</value>
> > --------------------------------------
> >
> > server 3:
> > core-site.xml
> > --------------------------------------
> > <name>fs.default.name</name>
> > <value>hdfs://5.6.7.11:8020</value>
> > --------------------------------------
> >
> > hdfs-site.xml
> > --------------------------------------
> > <name>dfs.datanode.address</name>
> > <value>0.0.0.0:50010</value>
> >
> > <name>dfs.datanode.http.address</name>
> > <value>0.0.0.0:50075</value>
> >
> > <name>dfs.http.address</name>
> > <value>127.0.0.1:50070</value>
> >
> > <name>dfs.secondary.https.port</name>
> > <value>50490</value>
> >
> > <name>dfs.https.port</name>
> > <value>50470</value>
> >
> > <name>dfs.https.address</name>
> > <value>127.0.0.1:50470</value>
> >
> > <name>dfs.secondary.http.address</name>
> > <value>5.6.7.12:50090</value>
> > --------------------------------------
> >
> > netstat output:
> > server 1
> >
> > tcp        0      0 5.6.7.11:8020           0.0.0.0:*   LISTEN   10870/java
> > tcp        0      0 5.6.7.11:50070          0.0.0.0:*   LISTEN   10870/java
> > tcp        0      0 0.0.0.0:50010           0.0.0.0:*   LISTEN   10997/java
> > tcp        0      0 0.0.0.0:50075           0.0.0.0:*   LISTEN   10997/java
> > tcp        0      0 0.0.0.0:50020           0.0.0.0:*   LISTEN   10997/java
> >
> >
> > server 2
> >
> > tcp        0      0 0.0.0.0:50010           0.0.0.0:*   LISTEN   23683/java
> > tcp        0      0 0.0.0.0:50075           0.0.0.0:*   LISTEN   23683/java
> > tcp        0      0 0.0.0.0:50020           0.0.0.0:*   LISTEN   23683/java
> > tcp        0      0 5.6.7.12:50090          0.0.0.0:*   LISTEN   23778/java
> >
> >
> > server 3
> >
> > tcp        0      0 0.0.0.0:50010           0.0.0.0:*   LISTEN   894/java
> > tcp        0      0 0.0.0.0:50075           0.0.0.0:*   LISTEN   894/java
> > tcp        0      0 0.0.0.0:50020           0.0.0.0:*   LISTEN   894/java
> >
> >
> > if I'm transferring big files between servers I get about 9 MB/s,
> > and even 10 MB/s with rsync
> >
> > On 10/09/12 11:56 PM, Harsh J wrote:
> >
> > Hi,
> >
> > OK, can you detail your network infrastructure used here, and also
> > make sure your daemons are binding to the right interfaces as well
> > (use netstat to check perhaps)? What rate of transfer do you get for
> > simple file transfers (ftp, scp, etc.)?
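[Editor's note: one way to act on the netstat suggestion is to check the local address each daemon socket is bound to. The sample line below mirrors the netstat output quoted later in the thread; on a live host you would pipe in `netstat -tlnp` instead:]

```shell
# Check the local address a daemon socket is bound to. The sample line
# mirrors the netstat output quoted later in the thread; on a live host,
# pipe in `netstat -tlnp` instead of echoing a sample.
sample='tcp 0 0 0.0.0.0:50010 0.0.0.0:* LISTEN 10997/java'
bind=$(echo "$sample" | awk '{ print $4 }')
echo "datanode listens on: $bind"   # 0.0.0.0 means all interfaces
```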
> >
> >
> > On Wed, Oct 10, 2012 at 12:24 PM, Alexey <alexxoid@gmail.com> wrote:
> >
> > Hello Harsh,
> >
> > I noticed such issues from the start.
> > Yes, I mean the dfs.balance.bandwidthPerSec property; I set it to
> > 5000000.
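[Editor's note: for reference, this property goes in hdfs-site.xml and is specified in bytes per second, so Alexey's 5000000 works out to roughly 5 MB/s. Note that it throttles balancer block moves, not ordinary client reads and writes:]

```xml
<!-- hdfs-site.xml: cap on bandwidth used by the balancer, in bytes/s.
     5000000 bytes/s is roughly 5 MB/s. -->
<property>
  <name>dfs.balance.bandwidthPerSec</name>
  <value>5000000</value>
</property>
```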
> >
> >
> > On 10/09/12 11:50 PM, Harsh J wrote:
> >
> > Hey Alexey,
> >
> > Have you noticed this right from the start itself? Also, what exactly
> > do you mean by "Limited replication bandwidth between datanodes -
> > 5Mb." - Are you talking of the dfs.balance.bandwidthPerSec property?
> >
> >
> > On Wed, Oct 10, 2012 at 10:53 AM, Alexey <alexxoid@gmail.com> wrote:
> >
> > Additional info: I also tried OpenJDK instead of Sun's JDK - the issue
> > still persists
> >
> >
> > On 10/09/12 03:12 AM, Alexey wrote:
> >
> > Hi,
> >
> > I have an issue with Hadoop DFS. I have 3 servers (24 GB RAM on each).
> > The servers are not overloaded; they just have Hadoop installed. One
> > has a datanode and the namenode, the second - a datanode only, the
> > third - a datanode and the secondarynamenode.
> >
> > Hadoop datanodes have a max memory limit of 8 GB. Default replication
> > factor - 2. Limited replication bandwidth between datanodes - 5Mb.
> >
> > I've set up Hadoop to communicate between nodes by IP address.
> > Everything works - I can read/write files on each datanode, etc. But
> > the issue is that hadoop dfs commands execute very slowly; even
> > "hadoop dfs -ls /" takes about 3 seconds, though it has only one
> > folder, /user, in it.
> > Files also upload to HDFS very slowly - hundreds of kilobytes/second.
> >
> > I'm using the Debian stable x86-64 distribution and Hadoop running on
> > sun-java6-jdk 6.26-0squeeze1
> >
> > Please give me any suggestions on what I need to adjust/check to
> > resolve this issue.
> >
> > As I said before - the overall HDFS configuration is correct, because
> > everything works except performance.
> >
> > --
> > Best regards
> > Alexey
> >
> >
> >
> > --
> > Best regards
> > Alexey
> >
> >
> >
> >
> >
> > --
> > Best regards
> > Alexey
> >
> >
> >
> >
> >
> > --
> > Best regards
> > Alexey
> >
> >
>
