hadoop-hdfs-user mailing list archives

From "Chandra Mohan, Ananda Vel Murugan" <Ananda.Muru...@honeywell.com>
Subject Question on BytesWritable
Date Tue, 01 Oct 2013 04:39:48 GMT

I am using Hadoop 1.0.2 and have written a MapReduce job. I have a requirement to process
each file whole, without splitting, so I wrote a new input format that overrides the
isSplitable() method, along with a new RecordReader implementation that reads the whole
file. I followed the sample in Chapter 7 of "Hadoop: The Definitive Guide". In my job, the
mapper emits a BytesWritable as its value. I want to get the bytes and read some specific
information from them, using a ByteArrayInputStream for further processing. But strangely,
the following code prints two different lengths, and because of this I am getting errors.

// value -> BytesWritable
System.out.println("Bytes length " + value.getLength());    // Bytes length 1931650
byte[] bytes = value.getBytes();
System.out.println("Bytes array length " + bytes.length);   // Bytes array length 2897340

My file size is 1931650 bytes, so I don't know why the byte array is bigger than the original file.

Any idea what is going wrong? Please help. Thanks in advance.
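For what it's worth, here is a minimal stand-alone sketch (no Hadoop needed) of what I suspect is happening: getBytes() may be handing back an oversized internal backing buffer, so only the first getLength() bytes are valid and the rest is padding. The class name TrimBytes and the trim() helper below are mine, just for illustration; they are not part of any Hadoop API.

```java
import java.util.Arrays;

public class TrimBytes {
    // If getBytes() returns a backing array larger than the valid data,
    // trimming to the reported valid length (what getLength() gives)
    // yields an array safe to wrap in a ByteArrayInputStream.
    static byte[] trim(byte[] backing, int validLength) {
        return Arrays.copyOf(backing, validLength);
    }

    public static void main(String[] args) {
        // Mimic an oversized backing buffer: 5 valid bytes in an 8-byte array.
        byte[] backing = new byte[8];
        byte[] payload = {1, 2, 3, 4, 5};
        System.arraycopy(payload, 0, backing, 0, payload.length);

        System.out.println("backing length = " + backing.length);          // 8
        System.out.println("trimmed length = " + trim(backing, 5).length); // 5
    }
}
```

If that is the cause, trimming the array to getLength() before building the ByteArrayInputStream (or passing the length as the stream's constructor argument) should make the two numbers agree.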
