hadoop-mapreduce-dev mailing list archives

From "Xiaoming Shi (JIRA)" <j...@apache.org>
Subject [jira] Created: (MAPREDUCE-2343) repeated replace() function calls damage the performance
Date Wed, 23 Feb 2011 20:14:43 GMT
 repeated replace() function calls damage the performance
---------------------------------------------------------

                 Key: MAPREDUCE-2343
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2343
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: tools/rumen
    Affects Versions: 0.21.0
            Reporter: Xiaoming Shi


In the files
{noformat}
hadoop-0.21.0/mapred/src/tools/org/apache/hadoop/tools/rumen/LoggedTaskAttempt.java   line: 362
hadoop-0.21.0/mapred/src/tools/org/apache/hadoop/tools/rumen/LoggedTask.java          line: 249
{noformat}

replace() is called several times in a row to remove the special characters.  This is 5+ times
slower than replacing them all in a single for loop.
For example:
{noformat}
 - str = str.replace('a', '|');
 - str = str.replace('b', '|');

 + StringBuilder sb = new StringBuilder(str.length());
 + for (int i = 0; i < str.length(); i++) {
 +     char c = str.charAt(i);
 +     if (c == 'a' || c == 'b')
 +         sb.append('|');
 +     else
 +         sb.append(c);   // keep the original character, not the literal 'c'
 + }
 + str = sb.toString();
{noformat}
This is the same problem as MySQL bug 45699: http://bugs.mysql.com/bug.php?id=45699
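
To make the comparison concrete, here is a minimal, self-contained sketch of both approaches side by side. The class and method names (CharReplaceSketch, chained, singlePass) are hypothetical, and 'a' and 'b' are only the placeholder characters from the example above, not the actual characters handled in the rumen sources. It illustrates that the two forms produce the same string; it does not measure the speed difference.

```java
// Hypothetical illustration only; not code taken from the rumen sources.
final class CharReplaceSketch {

    // Chained form: every replace() call scans the whole string again,
    // and each call that finds a match allocates a new String.
    static String chained(String s) {
        return s.replace('a', '|').replace('b', '|');
    }

    // Single-pass form: one scan over the input, one output buffer.
    static String singlePass(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // Map every target character to '|', copy everything else as-is.
            sb.append((c == 'a' || c == 'b') ? '|' : c);
        }
        return sb.toString();
    }
}
```

Calling CharReplaceSketch.singlePass(str) should be a drop-in substitute for the chained calls whenever every target character maps to the same replacement; if different characters need different replacements, the condition inside the loop would have to become a switch or a lookup.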

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
