hadoop-common-dev mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3860) Compare name-node performance when journaling is performed into local hard-drives or nfs.
Date Wed, 30 Jul 2008 02:01:31 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618038#action_12618038 ]

Konstantin Shvachko commented on HADOOP-3860:
---------------------------------------------

I benchmarked three operations: _create_, _rename_, and _delete_ using {{NNThroughputBenchmark}},
which is a pure name-node benchmark. It calls the name-node methods directly without using
the rpc protocol. So the *rpc overhead is not included* in these results, and should be measured
separately, say with a synthetic load generator.
In a sense these benchmarks determine an upper bound for the HDFS operations, namely the maximum
throughput the name-node can sustain under heavy load.

Each run starts with an empty file system and performs 1 million operations handled by 256
threads on the name-node. The output is the throughput, that is, the number of operations per
second, which is calculated as 1,000,000/(tE-tB), where tB is when the first thread starts,
and tE is when all threads stop. The threads run in parallel.
Creates create empty files and do not close them. Renames change file names, but do not move
them.
All test results are consistent except for one distortion in deletes on a remote drive, which
is way out of the expected range. I don't know what causes it: one day the numbers were good,
the other day they were not.
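
To make the throughput definition above concrete, here is a minimal sketch in Python. It is not the {{NNThroughputBenchmark}} code; {{run_benchmark}}, {{throughput}} and the {{operation}} callback are hypothetical names standing in for a direct name-node call. It only illustrates how tB and tE are taken across threads; Python threads do not give the real parallelism of the Java benchmark.

```python
import threading
import time


def throughput(total_ops, t_b, t_e):
    """Operations per second over the wall-clock window [t_b, t_e]."""
    elapsed = t_e - t_b
    if elapsed <= 0:
        raise ValueError("tE must be later than tB")
    return total_ops / elapsed


def run_benchmark(operation, total_ops, num_threads):
    """Split total_ops across num_threads workers and return ops/sec.

    `operation` is a hypothetical stand-in for a direct name-node call.
    """
    # Distribute the remainder so that exactly total_ops calls are made.
    base, extra = divmod(total_ops, num_threads)
    counts = [base + (1 if i < extra else 0) for i in range(num_threads)]
    starts = [0.0] * num_threads
    ends = [0.0] * num_threads

    def worker(idx):
        starts[idx] = time.perf_counter()
        for _ in range(counts[idx]):
            operation()
        ends[idx] = time.perf_counter()

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # tB is when the first thread starts, tE is when the last thread stops.
    return throughput(total_ops, min(starts), max(ends))
```

The window is measured from the earliest start to the latest finish, so idle time of threads that finish early still counts against the result.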

Each test consists of 1,000,000 operations performed using 256 threads.
Results are in *ops/sec*.
||Log to||open||create (no close)||rename||delete||
|none|126,119| | | |
|1 Local HD| |5,710|8,400|20,690|
|1 NFS HD| |5,600|8,290|12,090|
|1 NFS Filer| |5,676|8,134|21,100|
|4 Local HD| |5,210| | |
|3 Local HD, 1 NFS HD| |5,150| | |

Some conclusions:
-	Local drive is faster than nfs, and
-	nfs filer is faster than a remote drive;
-	but *the difference between nfs storage and local drives is very slim, only 2-3%*.
-	*Using 4 local drives instead of 1 degrades the performance by only 9%*, even though we
write onto the drives sequentially (one after another).
_It would be fair to say that there is some parallelism in writing, since the current code batches
writes first and then syncs them at once in large chunks. So while the writes are sequential,
the syncs are parallel._
-	Opens (getBlockLocation()) are 22 times faster than creates,
-	which means *journaling is the real bottleneck* for the name-node operations,
-	and the *lack of fine-grained locking in the namespace data-structures is not a problem*
so far. Otherwise, the throughputs for opens and other operations would be characterized by
the same or at least close numbers.
-	Further optimization of the name-node performance imo should be focused around *efficient
journaling*.
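
The percentages and ratios in the conclusions can be rechecked from the table with a few lines of Python. This is only an arithmetic sketch; the function and variable names are mine, and the single local drive is assumed to be the baseline for every comparison.

```python
# Throughput figures (ops/sec) copied from the benchmark table.
CREATE = {"1 Local HD": 5710, "1 NFS HD": 5600,
          "1 NFS Filer": 5676, "4 Local HD": 5210}
RENAME = {"1 Local HD": 8400, "1 NFS HD": 8290, "1 NFS Filer": 8134}
OPEN_NO_JOURNAL = 126119


def slowdown_pct(baseline, other):
    """Percentage by which `other` is slower than `baseline`."""
    return 100.0 * (baseline - other) / baseline


def ratio(fast, slow):
    """How many times `fast` exceeds `slow`."""
    return fast / slow


def summary():
    """Recompute the comparisons made in the conclusions."""
    local_create = CREATE["1 Local HD"]
    local_rename = RENAME["1 Local HD"]
    return {
        "create: NFS HD vs local (%)": slowdown_pct(local_create, CREATE["1 NFS HD"]),
        "create: NFS filer vs local (%)": slowdown_pct(local_create, CREATE["1 NFS Filer"]),
        "rename: NFS HD vs local (%)": slowdown_pct(local_rename, RENAME["1 NFS HD"]),
        "rename: NFS filer vs local (%)": slowdown_pct(local_rename, RENAME["1 NFS Filer"]),
        "create: 4 local vs 1 local (%)": slowdown_pct(local_create, CREATE["4 Local HD"]),
        "open vs create (times)": ratio(OPEN_NO_JOURNAL, local_create),
    }


if __name__ == "__main__":
    for label, value in summary().items():
        print(f"{label}: {value:.1f}")
```

With these inputs the create and rename slowdowns on nfs range from under 1% to a little over 3%, the 4-drive slowdown comes out near 9%, and opens are about 22 times faster than creates.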

Here is another set of statistical data, which characterizes the actual load on the name-node on
some of our clusters. Unfortunately, the statistics for opens are broken, and we do not collect
stats for renames. So I can only present creates and deletes. Please contribute if somebody has
more data.

||Actual load (ops/sec)||open||create||delete||
|peak| |144|6,460|
|average| |11|50|

-	These numbers show that the actual peak load for creates is about 40 times lower than the
name-node can handle, and 3 times lower for deletes. On average the picture is even more drastic.

*The name-node processing capability is 400-500 times higher than the actual average load
on it.*
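
The headroom figures can be reproduced by dividing benchmark throughput by observed load. The sketch below assumes the "1 Local HD" row is the capacity baseline, which the comment does not state explicitly; the names are illustrative.

```python
# Benchmark capacity in ops/sec. Assumption: the "1 Local HD" row is the baseline.
CAPACITY = {"create": 5710, "delete": 20690}

# Observed name-node load in ops/sec, copied from the actual-load table.
ACTUAL = {
    "peak": {"create": 144, "delete": 6460},
    "average": {"create": 11, "delete": 50},
}


def headroom(capacity, load):
    """How many times the measured capacity exceeds the observed load."""
    return capacity / load


def headroom_table():
    """Headroom factor for every (load level, operation) pair."""
    return {
        level: {op: headroom(CAPACITY[op], load) for op, load in loads.items()}
        for level, loads in ACTUAL.items()
    }


if __name__ == "__main__":
    for level, ops in headroom_table().items():
        for op, factor in ops.items():
            print(f"{level} {op}: {factor:.0f}x")
```

Under that assumption the peak factors are about 40 for creates and 3 for deletes, and the average factors fall in the 400-500 range quoted above.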


> Compare name-node performance when journaling is performed into local hard-drives or nfs.
> -----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3860
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3860
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>             Fix For: 0.19.0
>
>         Attachments: NNThruputMoreOps.patch
>
>
> The goal of this issue is to measure how the name-node performance depends on where the edits log is written to.
> Three types of the journal storage should be evaluated:
> # local hard drive;
> # remote drive mounted via nfs;
> # nfs filer.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

