hadoop-common-user mailing list archives

From Ken Krugler <kkrugler_li...@transpac.com>
Subject Re: TestDFSIO failure
Date Fri, 02 Sep 2011 00:27:25 GMT
Hi Matt,

On Jun 20, 2011, at 1:46pm, GOEKE, MATTHEW (AG/1000) wrote:

> Has anyone else run into issues using output compression (in our case lzo) on TestDFSIO
> and it failing to be able to read the metrics file? I just assumed that it would use the
> correct decompression codec after it finishes but it always returns with a 'File not
> found' exception.

Yes, I've run into the same issue on 0.20.2 and CDH3u0.

I don't see any Jira issue that covers this problem, so unless I hear otherwise I'll file one.

The problem is that the post-job code doesn't handle getting the <path>.deflate or <path>.lzo
(for you) file from HDFS, and then decompressing it.

> Is there a simple way around this without spending the time to recompile a cluster/codec
> specific version?

You can use "hadoop fs -text <path reported in exception>.lzo"

This will dump out the file, which looks like:

f:rate  171455.11
f:sqrate        2981174.8
l:size  10485760000
l:tasks 10
l:time  590537

If you take f:rate/1000/l:tasks, that should give you the average MB/sec.

E.g. for the example above, that would be 171455 / 1000 / 10 ≈ 17 MB/sec.
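If you're scripting this, the calculation above can be sketched as a small parser over the "hadoop fs -text" output. This is just an illustration of the arithmetic in this email; the helper names (parse_metrics, average_mb_per_sec) are my own, not part of TestDFSIO:

```python
def parse_metrics(text):
    """Parse TestDFSIO metrics lines of the form 'f:rate<TAB>171455.11'
    into a dict of key -> float."""
    metrics = {}
    for line in text.strip().splitlines():
        key, value = line.split()
        metrics[key] = float(value)
    return metrics


def average_mb_per_sec(metrics):
    # Same formula as above: f:rate / 1000 / l:tasks
    return metrics["f:rate"] / 1000 / metrics["l:tasks"]


# Sample output from "hadoop fs -text <path>.lzo"
sample = """f:rate\t171455.11
f:sqrate\t2981174.8
l:size\t10485760000
l:tasks\t10
l:time\t590537"""

print(average_mb_per_sec(parse_metrics(sample)))  # ~17.1 MB/sec
```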

-- Ken

Ken Krugler
+1 530-210-6378
custom big data solutions & training
Hadoop, Cascading, Mahout & Solr
