hadoop-mapreduce-user mailing list archives

From Pedro Costa <psdc1...@gmail.com>
Subject Fwd: raw length vs part length
Date Tue, 01 Feb 2011 12:19:09 GMT
I have to make a correction and add a new question.
The correction:
PartLength: 14 bytes
Raw length: 10 bytes

Added question:
Why is the IFileInputStream class created with a size of 14 bytes if
the data (segments) is only 10 bytes, even though the whole
map-0.out file is 14 bytes?
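
To make the arithmetic concrete, here is a minimal sketch of one possible
explanation. It is an assumption, not something confirmed in this thread:
that the 4-byte gap between the two lengths is a CRC32 checksum trailer
appended after the segment data. The class LengthSketch and its helpers
partLength and withChecksum are hypothetical names for illustration only
and are not part of Hadoop.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

public class LengthSketch {
    // Assumption: the trailer is a CRC32 value stored in 4 bytes.
    static final int CHECKSUM_BYTES = 4;

    // Hypothetical helper: on-disk length of an uncompressed segment
    // under the assumption above (data bytes plus checksum trailer).
    static long partLength(long rawLength) {
        return rawLength + CHECKSUM_BYTES;
    }

    // Hypothetical helper: copy the payload and append its CRC32
    // as a 4-byte big-endian integer.
    static byte[] withChecksum(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        byte[] out = Arrays.copyOf(payload, payload.length + CHECKSUM_BYTES);
        ByteBuffer.wrap(out, payload.length, CHECKSUM_BYTES)
                  .putInt((int) crc.getValue());
        return out;
    }

    public static void main(String[] args) {
        byte[] segment = new byte[10];  // stands in for the 10 bytes of segment data
        System.out.println("raw length:  " + segment.length);
        System.out.println("part length: " + withChecksum(segment).length);
    }
}
```

If that assumption holds, a stream that reads the whole 14-byte file would
have to be sized to include the trailer so it can verify the checksum,
while only the first 10 bytes are record data.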

---------- Forwarded message ----------
From: Pedro Costa <psdc1978@gmail.com>
Date: Tue, Feb 1, 2011 at 11:43 AM
Subject: raw length vs part length
To: mapreduce-user@hadoop.apache.org


Hadoop uses the compressed length and the raw length.

1 - In my example, the reduce task (RT) is fetching a map output that
reports a raw length of 14 bytes and a partLength of 10 bytes. The map
output doesn't use any compression.
When I'm dealing with uncompressed data, shouldn't the raw length be 14
and the partLength 0? I ask because the data being transferred to the
RT is uncompressed.

2 - Is the raw length of the map output the size of the block (10
bytes) plus a header?

3 - Does part length mean partition length?


