hadoop-hdfs-issues mailing list archives

From "Staffan Friberg (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
Date Thu, 05 Nov 2015 00:06:27 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14990758#comment-14990758 ]

Staffan Friberg commented on HDFS-9260:
---------------------------------------

Hi Daryn,

Thanks for the comments and the additional data points. Interesting to learn more about the
scale of HDFS instances. I wonder if the NN was running on older and slower hardware in my
case compared to your setup; the cluster I was able to get my hands on for these runs has
fairly old machines.

Adds of new blocks are relatively fast: since they land at the far right of the tree, the
number of lookups will be minimal. However, the current implementation only needs around
two writes to insert something at the head/end of the list, and nothing with a more complicated
data structure will be able to match that. It will be a question of trade-offs.
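
To make the trade-off concrete, here is a minimal sketch. It is not code from either patch: `Node`, `HeadList` and `CountingComparator` are hypothetical names, and `java.util.TreeSet` is only a stand-in for a sorted structure, so its lookup cost will differ from the tree in the patch. It shows that a head insert into an unsorted list is two reference writes, while a sorted set has to compare keys on every insert.

```java
import java.util.Comparator;
import java.util.TreeSet;

final class InsertCostSketch {
    /** Hypothetical list node; a stand-in, not the HDFS BlockInfo class. */
    static final class Node {
        final long blockId;
        Node next;
        Node(long blockId) { this.blockId = blockId; }
    }

    /** Unsorted linked list: inserting at the head needs two reference writes. */
    static final class HeadList {
        Node head;
        int size;

        void addFirst(long blockId) {
            Node n = new Node(blockId);
            n.next = head; // write 1: new node points at the old head
            head = n;      // write 2: the list now starts at the new node
            size++;        // bookkeeping only, not part of the linking
        }
    }

    /** Comparator that counts how often the sorted set compares two keys. */
    static final class CountingComparator implements Comparator<Long> {
        long comparisons;

        @Override
        public int compare(Long a, Long b) {
            comparisons++;
            return Long.compare(a, b);
        }
    }

    /** Inserts ids 0..n-1 in ascending order and returns the comparisons used. */
    static long comparisonsForAscendingInserts(int n) {
        CountingComparator cmp = new CountingComparator();
        TreeSet<Long> sorted = new TreeSet<>(cmp);
        for (long id = 0; id < n; id++) {
            sorted.add(id);
        }
        return cmp.comparisons;
    }

    public static void main(String[] args) {
        int n = 65_536; // same order of magnitude as the 64k-entry benchmarks
        HeadList list = new HeadList();
        for (long id = 0; id < n; id++) {
            list.addFirst(id);
        }
        long comparisons = comparisonsForAscendingInserts(n);
        System.out.println("list inserts: " + list.size + " (2 reference writes each)");
        System.out.println("tree comparisons per insert: " + (double) comparisons / n);
    }
}
```

The comparison count is only a rough proxy for the extra work per insert; it says nothing about cache behaviour or GC cost, which is what the sorted layout is meant to improve.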

Also, to clarify, the microbenchmarks only measure the actual remove and insert of random
values, not the whole process of copying files etc. I would expect the other parts to far outweigh
the time it takes to update the data structures, so while the 4x sounds scary it should be
a minor part of the whole transaction.

I think the patch you are referring to is HDFS-6658. I applied it to the 3.0.0 branch as of
March 11 2015, which is when the patch was created, and ran it on the same microbenchmarks
I built to test my patch. I will attach the source code for the benchmarks so you can check
that I used the right APIs for it to be comparable. From what I can tell the benchmarks should
do the same thing at a high level. The performance overhead for adding and removing is similar
between our two implementations.

{noformat}
fbrAllExisting  - Do a Full Block Report with the same 2M entries that are already registered
                  for the Storage in the NN.
addRemoveBulk   - Remove 32k random blocks from a StorageInfo that has 64k entries, then re-add
                  them all.
addRemoveRandom - Remove and directly re-add a block from a Storage entry, repeat for 32k
                  blocks from a StorageInfo with 64k blocks.
iterate         - Iterate and get blockID for 64k blocks associated with a particular StorageInfo.

==> benchmarks_trunkMarch11_intMapping.jar.output <==
Benchmark                          Mode  Cnt    Score   Error  Units
FullBlockReport.fbrAllExisting     avgt   25  379.659 ± 5.463  ms/op
StorageInfoAccess.addRemoveBulk    avgt   25   16.426 ± 0.380  ms/op
StorageInfoAccess.addRemoveRandom  avgt   25   15.401 ± 0.196  ms/op
StorageInfoAccess.iterate          avgt   25    1.496 ± 0.004  ms/op

==> benchmarks_trunk_baseline.jar.output <==
Benchmark                          Mode  Cnt    Score   Error  Units
FullBlockReport.fbrAllExisting     avgt   25  288.974 ± 3.970  ms/op
StorageInfoAccess.addRemoveBulk    avgt   25    3.157 ± 0.046  ms/op
StorageInfoAccess.addRemoveRandom  avgt   25    2.815 ± 0.012  ms/op
StorageInfoAccess.iterate          avgt   25    0.788 ± 0.006  ms/op

==> benchmarks_trunk_treeset.jar.output <==
Benchmark                          Mode  Cnt    Score   Error  Units
FullBlockReport.fbrAllExisting     avgt   25  231.270 ± 3.450  ms/op
StorageInfoAccess.addRemoveBulk    avgt   25   11.596 ± 0.521  ms/op
StorageInfoAccess.addRemoveRandom  avgt   25   11.249 ± 0.101  ms/op
StorageInfoAccess.iterate          avgt   25    0.385 ± 0.010  ms/op
{noformat}
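
For reading the tables, a small sketch that turns the scores above into ratios against the trunk baseline. The class and method names are hypothetical; the numbers are copied from the output above. Note that the intMapping run was built on the March 11 branch, so its ratio to the current trunk baseline is only indicative.

```java
final class BenchmarkRatios {
    /** Ratio of a candidate score to the baseline score (both ms/op, lower is better). */
    static double ratio(double candidateMs, double baselineMs) {
        return candidateMs / baselineMs;
    }

    public static void main(String[] args) {
        String[] names = {"fbrAllExisting", "addRemoveBulk", "addRemoveRandom", "iterate"};
        // Scores in ms/op, copied from the three output files above.
        double[] baseline   = {288.974, 3.157, 2.815, 0.788};
        double[] treeset    = {231.270, 11.596, 11.249, 0.385};
        double[] intMapping = {379.659, 16.426, 15.401, 1.496};

        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-16s treeset %.2fx  intMapping %.2fx%n",
                    names[i],
                    ratio(treeset[i], baseline[i]),
                    ratio(intMapping[i], baseline[i]));
        }
    }
}
```

With these numbers the treeset build takes roughly 0.8x the baseline time for the full block report and about half for iteration, and roughly 3.7x to 4x for the add/remove benchmarks.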

Do you have a suggestion for another perf/stress test that would be good to try?
Is there any stress load on your end that it would be possible to try the patch on?

> Improve performance and GC friendliness of startup and FBRs
> -----------------------------------------------------------
>
>                 Key: HDFS-9260
>                 URL: https://issues.apache.org/jira/browse/HDFS-9260
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, namenode, performance
>    Affects Versions: 2.7.1
>            Reporter: Staffan Friberg
>            Assignee: Staffan Friberg
>         Attachments: FBR processing.png, HDFS Block and Replica Management 20151013.pdf,
HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, HDFS-7435.004.patch, HDFS-7435.005.patch,
HDFS-7435.006.patch, HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to keep them sorted.
This allows faster and more GC friendly handling of full block reports.
> Would like to hear people's feedback on this change and also some help investigating/understanding
a few outstanding issues if we are interested in moving forward with this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
