hadoop-common-dev mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2000) Re-write NNBench to use MapReduce
Date Fri, 02 Nov 2007 11:09:50 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12539565 ]

Konstantin Shvachko commented on HADOOP-2000:

# redundant imports
import java.text.DateFormat;
import org.apache.hadoop.mapred.Reducer;
# variable name in NNBenchMapper.map() is never used.
# Typo
    // Set user-dfined parameters,
# The label says TPS but the value printed is TPmS. The two should match:
    "       RAW DATA: TPS Total : " + totalTimeTPmS,
# The name double totalTimeTPS is confusing: according to the formula and the comments,
it holds TPS, not a time.
# I am not happy with the whole concept of transactions per second as measured here.
You measure the total execution time of each map (t_i) and then compute Number_of_files / Sum(t_i).
But Sum(t_i) is not the right denominator, because the maps run in parallel.
To obtain the true TPS you need to time the start and the end of +*all*+ maps
rather than the start and the end of +*individual*+ maps.
However, it is hard to get the exact start and end times of the job's map stage.
Your proposed TPS measures the number of transactions per second of a single client under a certain
load on the cluster.
This is not completely unreasonable, but it does not say much as a benchmark result imo:
it is quite clear that if the cluster bears more load, the clients run slower.
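
To make the difference between the two denominators concrete, here is a minimal sketch. It is not code from NNBench or the patch: the class TpsSketch, the MapTiming holder, and both method names are hypothetical, and the timings in main() are made-up illustrative numbers, not measurements.

```java
import java.util.Arrays;
import java.util.List;

public class TpsSketch {
    /** Hypothetical holder for one map task's timing (milliseconds) and file count. */
    static final class MapTiming {
        final long startMs;
        final long endMs;
        final long files;

        MapTiming(long startMs, long endMs, long files) {
            this.startMs = startMs;
            this.endMs = endMs;
            this.files = files;
        }
    }

    /** Number_of_files / Sum(t_i): the per-map execution times are added up. */
    static double tpsFromSummedTimes(List<MapTiming> maps) {
        long totalFiles = 0;
        long sumMs = 0;
        for (MapTiming m : maps) {
            totalFiles += m.files;
            sumMs += m.endMs - m.startMs;
        }
        return totalFiles * 1000.0 / sumMs;
    }

    /** Number_of_files / (latest end - earliest start): wall-clock time of all maps. */
    static double tpsFromWallClock(List<MapTiming> maps) {
        long totalFiles = 0;
        long firstStart = Long.MAX_VALUE;
        long lastEnd = Long.MIN_VALUE;
        for (MapTiming m : maps) {
            totalFiles += m.files;
            firstStart = Math.min(firstStart, m.startMs);
            lastEnd = Math.max(lastEnd, m.endMs);
        }
        return totalFiles * 1000.0 / (lastEnd - firstStart);
    }

    public static void main(String[] args) {
        // Illustrative input: four maps run fully in parallel for 10 s, 1000 files each.
        List<MapTiming> maps = Arrays.asList(
            new MapTiming(0, 10_000, 1000),
            new MapTiming(0, 10_000, 1000),
            new MapTiming(0, 10_000, 1000),
            new MapTiming(0, 10_000, 1000));
        System.out.println("Summed-time TPS: " + tpsFromSummedTimes(maps));
        System.out.println("Wall-clock TPS:  " + tpsFromWallClock(maps));
    }
}
```

With these made-up inputs the summed-time formula equals the rate of one client (1000 files in 10 s), while the wall-clock formula reflects the combined rate of all four maps. In a real job the per-map start and end times would have to come from the tasks themselves, and clock skew between nodes would limit how exact the wall-clock figure can be.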

> Re-write NNBench to use MapReduce
> ---------------------------------
>                 Key: HADOOP-2000
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2000
>             Project: Hadoop
>          Issue Type: Test
>          Components: test
>    Affects Versions: 0.15.0
>            Reporter: Mukund Madhugiri
>            Assignee: Mukund Madhugiri
>             Fix For: 0.16.0
>         Attachments: HADOOP-2000.patch, HADOOP-2000.patch, HADOOP-2000.patch, HADOOP-2000.patch,
> The proposal is to re-write the NNBench benchmark/test to measure Namenode operations
> using MapReduce. Two buckets of measurements will be done:
> 1. Transactions per second 
> 2. Average latency
> for these operations
> - Create and Close file
> - Open file
> - Rename file
> - Delete file

