Reply-To: hdfs-dev@hadoop.apache.org
MIME-Version: 1.0
References: <2108528544.2669.1309926736965.JavaMail.tomcat@hel.zones.apache.org>
From: Todd Lipcon
Date: Sun, 17 Jul 2011 21:55:20 -0700
Subject: Re: [jira] [Created] (HDFS-2129) Simplify BlockReader to not inherit from FSInputChecker
To: hdfs-dev@hadoop.apache.org
Content-Type: multipart/alternative;
 boundary=005045029843c2133104a850cf32

--005045029843c2133104a850cf32
Content-Type: text/plain; charset=ISO-8859-1

For benchmarking CPU, I start a pseudo-distributed HDFS cluster, put a
smallish file on the local datanode (small enough to fit in the buffer
cache), and then run the following script with various parameters to
measure the CPU used to cat the file. For example:

$ REPS_PER_RUN=50 NUM_TRIALS=10 ./read-benchmark.sh hdfs://localhost/128M-file /tmp/benchmark-results.txt

Script:

#!/bin/sh -x
set -e

BINDIR=$(dirname "$0")
INPUT=$1    # HDFS path to read
OUTPUT=$2   # local file that collects one row of timings per trial
NUM_TRIALS=${NUM_TRIALS:-10}
HADOOP=${HADOOP:-./bin/hadoop}
# $((...)) is the POSIX arithmetic form; $[...] is not understood by every /bin/sh
HADOOP_FLAGS=${HADOOP_FLAGS:--Dio.file.buffer.size=$((64*1024))}
REPS_PER_RUN=${REPS_PER_RUN:-1}

HEADER="major\tminor\tfs_in\tfs_out\twall\tuser\tsys\tctx_invol\tctx_vol\n"
TIME_FORMAT="%F\t%R\t%I\t%O\t%e\t%U\t%S\t%c\t%w"

# Write the column header only when the results file does not exist yet
[ -f "$OUTPUT" ] || printf "$HEADER" > "$OUTPUT"

for x in $(seq 1 "$NUM_TRIALS") ; do
  # Each trial cats the input REPS_PER_RUN times in a single JVM
  /usr/bin/time --append -o "$OUTPUT" -f "$TIME_FORMAT" \
    $HADOOP fs $HADOOP_FLAGS -cat $(for rep in $(seq 1 "$REPS_PER_RUN") ; do echo "$INPUT" ; done) > /dev/null
done

On Wed, Jul 6, 2011 at 1:16 AM, Keren Ouaknine wrote:

> Hello,
>
> I am working on the optimization of task scheduling for Hadoop and would
> like to benchmark with *Apache Hadoop's standard benchmarks*. So far, I
> used my own scripts to measure and monitor. Where can I find the
> benchmarking you are referring to please?
>
> Thanks,
> Keren
>
> On Wed, Jul 6, 2011 at 7:32 AM, Todd Lipcon (JIRA) wrote:
>
> > Simplify BlockReader to not inherit from FSInputChecker
> > -------------------------------------------------------
> >
> >                 Key: HDFS-2129
> >                 URL: https://issues.apache.org/jira/browse/HDFS-2129
> >             Project: Hadoop HDFS
> >          Issue Type: Sub-task
> >          Components: hdfs client
> >            Reporter: Todd Lipcon
> >            Assignee: Todd Lipcon
> >
> >
> > BlockReader is currently quite complicated since it has to conform to the
> > FSInputChecker inheritance structure. It would be much simpler to implement
> > it standalone. Benchmarking indicates it's slightly faster, as well.
> >
> > --
> > This message is automatically generated by JIRA.
> > For more information on JIRA, see: http://www.atlassian.com/software/jira
>
> --
> Keren Ouaknine
> Cell: +972 54 2565404
> Web: www.kereno.com

--
Todd Lipcon
Software Engineer, Cloudera

--005045029843c2133104a850cf32--
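
To compare runs, the results file written by the benchmark script can be
reduced to per-trial averages. The following is only a sketch: the function
name summarize_results is hypothetical and not part of the original script,
and it assumes the tab-separated layout described by HEADER, with wall, user
and sys seconds in columns 5, 6 and 7.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original benchmark script.
# Prints the number of trials and the mean wall, user and sys seconds,
# plus mean CPU (user + sys), from a results file whose first line is
# the header and whose remaining lines are one tab-separated row per trial.
summarize_results() {
  awk -F '\t' '
    NR == 1 { next }    # skip the header row
    NF < 7  { next }    # skip lines that are not full data rows
    { n++; wall += $5; user += $6; sys += $7 }
    END {
      if (n == 0) { print "no trials found"; exit 1 }
      printf "trials=%d wall=%.2f user=%.2f sys=%.2f cpu=%.2f\n",
             n, wall / n, user / n, sys / n, (user + sys) / n
    }
  ' "$1"
}
```

Usage would be `summarize_results /tmp/benchmark-results.txt`. Each row covers
REPS_PER_RUN reads of the file, so divide the averages by that value to get a
per-read figure.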