hadoop-common-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-6133) ReflectionUtils performance regression
Date Wed, 08 Jul 2009 22:13:14 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HADOOP-6133:

    Attachment: hadoop-6133-0.20.patch

Here is one possible patch to fix this issue. With it applied, the benchmark gives:

ReflectionUtils on post-patch branch-0.20: ~18.1sec

(still about 2.3x slower than the ~8.0sec measured on 0.18.3, but at least tolerable)

This is not the most elegant fix, but the ClassLoader inside Configuration makes it slightly
difficult to do this at a different layer.
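
The patch itself is attached rather than quoted in this message, so its exact approach is not shown here. One common way to reduce repeated Class.forName cost is to memoize lookups per ClassLoader. The following is only a minimal sketch of that general technique: ClassLookupCache and its forName method are hypothetical names, not Hadoop APIs, and this is not necessarily what the attached patch does.

```java
import java.lang.ref.WeakReference;
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper (not part of Hadoop): memoizes Class.forName per ClassLoader. */
final class ClassLookupCache {
    // Loaders are held weakly so a discarded ClassLoader can still be collected.
    // Classes are held through WeakReference because a Class strongly references
    // its loader, which would otherwise keep the weak key alive forever.
    private static final Map<ClassLoader, Map<String, WeakReference<Class<?>>>> CACHE =
        Collections.synchronizedMap(
            new WeakHashMap<ClassLoader, Map<String, WeakReference<Class<?>>>>());

    private ClassLookupCache() {}

    /** Returns the class for the given name, calling Class.forName only on a cache miss. */
    static Class<?> forName(String name, ClassLoader loader) throws ClassNotFoundException {
        Map<String, WeakReference<Class<?>>> perLoader =
            CACHE.computeIfAbsent(loader, l -> new ConcurrentHashMap<>());
        WeakReference<Class<?>> ref = perLoader.get(name);
        Class<?> cls = (ref == null) ? null : ref.get();
        if (cls == null) {
            // Miss, or the class was collected: do the expensive lookup once and remember it.
            cls = Class.forName(name, true, loader);
            perLoader.put(name, new WeakReference<Class<?>>(cls));
        }
        return cls;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        Class<?> first = forName("java.util.ArrayList", loader);
        Class<?> second = forName("java.util.ArrayList", loader);
        System.out.println(first == second);
    }
}
```

Keying the cache by ClassLoader matters because the same class name can resolve to different classes under different loaders, which is the complication the ClassLoader inside Configuration introduces.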

As for why this matters: despite advice not to use ReflectionUtils in any hot path,
there are cases where it happens anyway. For example, MapWritable and GenericWritable do so
on every deserialization. Outside libraries like Cascading also seem not to reuse objects in
WritableDeserialization, and we have reports of some jobs spending nearly 100% of their CPU
time in Class.forName when profiled.

This patch is against branch-0.20 but should also work post-split after the file rename.

> ReflectionUtils performance regression
> --------------------------------------
>                 Key: HADOOP-6133
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6133
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.20.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hadoop-6133-0.20.patch, Test.java
> HADOOP-4187 introduced extra calls to Class.forName in ReflectionUtils.setConf. This
> caused a fairly large performance regression. Attached is a microbenchmark that shows the
> following timings for 100M constructions of new instances:
> Explicit construction (new Test): ~1.6sec
> Using Test.class.newInstance: ~2.6sec
> ReflectionUtils on 0.18.3: ~8.0sec
> ReflectionUtils on 0.20.0: ~200sec
> This illustrates the slowdown caused by HADOOP-4187: roughly 25x relative to 0.18.3, and
> nearly 80x relative to plain Class.newInstance.
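
The attached Test.java is not reproduced in this message. As a rough illustration of the kind of comparison described, the sketch below times three construction strategies without any Hadoop dependency. ConstructionBench, Sample, and the helper methods are hypothetical names; the Class.forName variant only approximates the extra lookup that HADOOP-4187 added and is not ReflectionUtils itself. It uses getDeclaredConstructor().newInstance() because Class.newInstance is deprecated on newer JDKs.

```java
import java.util.function.Supplier;

/** Hypothetical microbenchmark sketch; it does not reproduce the attached Test.java. */
final class ConstructionBench {
    /** Stand-in for the Test class used in the original benchmark. */
    static final class Sample {
        Sample() {}
    }

    /** Calls the factory n times and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(int n, Supplier<Object> factory) {
        long created = 0;
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            // Inspect each result to make it harder for the JIT to discard the allocation.
            if (factory.get() != null) {
                created++;
            }
        }
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        if (created != n) {
            throw new IllegalStateException("factory returned null");
        }
        return elapsedMillis;
    }

    /** Reflective construction from an already-resolved Class object. */
    static Object viaReflection() {
        try {
            return Sample.class.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reflective construction that repeats the by-name class lookup on every call. */
    static Object viaForName() {
        try {
            Class<?> cls = Class.forName(Sample.class.getName());
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 1_000_000;
        System.out.println("new Sample():       " + timeMillis(n, Sample::new) + " ms");
        System.out.println("reflective:         " + timeMillis(n, ConstructionBench::viaReflection) + " ms");
        System.out.println("forName + reflect:  " + timeMillis(n, ConstructionBench::viaForName) + " ms");
    }
}
```

Absolute numbers from a loop like this depend heavily on JVM version and warm-up, so it is only useful for comparing the strategies against each other on the same machine.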

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
