hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1028) INode.getPathNames could split more efficiently
Date Wed, 05 May 2010 22:48:15 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864568#action_12864568 ]

Todd Lipcon commented on HDFS-1028:
-----------------------------------

+1 to the body of the patch. Regarding the test change, how long does the test case take now?
Can we make a better benchmark that is independent of the unit tests (I assume this change
was to show a speed improvement)? I don't think it makes sense to overload the unit tests
for the purposes of benchmarking.

Personally I'd be satisfied to just have simple timings of loading one of your production
fsimages with/without the change.
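A standalone timing harness along these lines could serve as the benchmark, kept apart from the unit tests. This is only a rough sketch: the class name SplitTiming and the method timeNanos are hypothetical, and it times a splitter over a fixed path string rather than loading a real fsimage, so it complements the with/without fsimage timings and does not replace them.

```java
import java.util.function.Function;

// Hypothetical harness, not part of the HDFS-1028 patch.
public class SplitTiming {
    // Accumulates result sizes so the JIT cannot discard the split calls.
    static long sink = 0;

    // Runs the given splitter `iterations` times on `path` and
    // returns the elapsed wall-clock time in nanoseconds.
    static long timeNanos(Function<String, String[]> splitter,
                          String path, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += splitter.apply(path).length;
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        String path = "/user/example/dir/subdir/file"; // made-up sample path
        int iterations = 1_000_000;

        // Warm-up pass so the timed pass measures compiled code.
        timeNanos(p -> p.split("/"), path, iterations);

        long regexNanos = timeNanos(p -> p.split("/"), path, iterations);
        System.out.println("String.split: " + regexNanos / 1_000_000 + " ms");
    }
}
```

A second call to timeNanos with the patched splitter as the first argument would give the number to compare against.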

> INode.getPathNames could split more efficiently
> -----------------------------------------------
>
>                 Key: HDFS-1028
>                 URL: https://issues.apache.org/jira/browse/HDFS-1028
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: Todd Lipcon
>            Assignee: Dmytro Molkov
>            Priority: Minor
>         Attachments: HDFS-split.patch
>
>
> INode.getPathNames uses String.split(String), which goes through the full Java regex
> implementation. Since we always split on a single char, we could implement a faster
> one like StringUtils.split() (except without the escape character). This takes a significant
> amount of CPU during FSImage loading, so it should be a worthwhile speedup.
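For illustration, a single-character split that avoids the regex machinery might look like the sketch below. This is not the code from HDFS-split.patch; the class name SingleCharSplit and its split method are hypothetical. One assumption worth flagging: unlike String.split(String), this version keeps trailing empty strings, so a path ending in the separator yields one extra empty element.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the attached patch.
public class SingleCharSplit {
    // Splits str on every occurrence of separator with a single linear
    // scan. No regex is compiled and no escape character is handled.
    static String[] split(String str, char separator) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < str.length(); i++) {
            if (str.charAt(i) == separator) {
                parts.add(str.substring(start, i));
                start = i + 1;
            }
        }
        // Remainder after the last separator (may be empty).
        parts.add(str.substring(start));
        return parts.toArray(new String[0]);
    }
}
```

For an absolute path such as "/a/b" this returns the same three components as "/a/b".split("/"): an empty leading string, then "a" and "b".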

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

