hadoop-mapreduce-issues mailing list archives

From "mingleizhang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-6729) Hitting performance and error when lots of files to write or read
Date Wed, 06 Jul 2016 09:16:11 GMT
mingleizhang created MAPREDUCE-6729:

             Summary: Hitting performance and error when lots of files to write or read
                 Key: MAPREDUCE-6729
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: benchmarks, performance, test
            Reporter: mingleizhang
            Priority: Minor

When using DFSIO as a distributed I/O benchmark and writing a large number of files to disk, or reading them back, we can hit both a performance problem and imprecise results. The issue is that the existing implementation deletes the old files before running a job, and that deletion happens inside the timed section. The extra time consumption skews the measured execution time and therefore the reported throughput. We should replace or improve this behavior to prevent this from happening in the future.

public static void testWrite() throws Exception {
    FileSystem fs = cluster.getFileSystem();
    long tStart = System.currentTimeMillis();
    bench.writeTest(fs);  // runs the write benchmark, including the deletes below
    long execTime = System.currentTimeMillis() - tStart;
    bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime);
}

private void writeTest(FileSystem fs) throws IOException {
  Path writeDir = getWriteDir(config);
  // These deletes run inside the interval timed by testWrite above,
  // so cleanup cost is counted against the benchmark result.
  fs.delete(getDataDir(config), true);
  fs.delete(writeDir, true);
  runIOTest(WriteMapper.class, writeDir);
}
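One possible shape of a fix is to run the cleanup before starting the clock, so only the I/O under test contributes to execTime. The following is a minimal sketch, not the actual TestDFSIO API; `TimingSketch`, `timedRun`, `setup`, and `job` are hypothetical names:

```java
// Sketch: perform setup/cleanup (e.g. deleting old files) untimed,
// then time only the benchmarked work.
public class TimingSketch {
    /** Runs setup outside the timed interval, then returns the job's elapsed millis. */
    public static long timedRun(Runnable setup, Runnable job) {
        setup.run();                                  // cleanup excluded from timing
        long tStart = System.currentTimeMillis();
        job.run();                                    // only the I/O under test is timed
        return System.currentTimeMillis() - tStart;
    }
}
```

Applied to the snippets above, this would mean moving the `fs.delete(...)` calls out of the section bracketed by `tStart` and `execTime` in `testWrite`, so the reported throughput reflects only the reads and writes themselves.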


This message was sent by Atlassian JIRA

To unsubscribe, e-mail: mapreduce-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-help@hadoop.apache.org
