hadoop-common-user mailing list archives

From Matthew John <tmatthewjohn1...@gmail.com>
Subject Strange output in modified TeraSort
Date Tue, 21 Sep 2010 11:13:21 GMT
Hi all,

I am working on a sort job that takes in 40-byte records (an 8-byte
LongWritable key and a 32-byte BytesWritable value), sorts them, and writes
them out. For this I have got a modified TeraSort working (thanks to Jeff!).
A C long long and a Java long are stored with different byte orders (e.g. 1
as a C long long --> 0001 0000 0000 0000 and as a Java long --> 0000 0000
0000 0100), so I have modified the read and write in the input/output to
convert each value from C format to Java format on input, sort it, and
finally write it back in C format in my output.
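
A minimal sketch of that kind of conversion, assuming the C side writes
little-endian longs (as on x86) while Java's data streams use big-endian. The
class and method names are hypothetical and are not the actual modified
TeraSort code. Both helpers work on local buffers only, so calling them
repeatedly on the same value always gives the same bytes.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class ByteOrderSketch {
    // Decode 8 little-endian bytes (assumed C layout) into a Java long.
    static long fromCBytes(byte[] raw) {
        return ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    // Encode a Java long as 8 little-endian bytes in a fresh array,
    // leaving the caller's value untouched.
    static byte[] toCBytes(long value) {
        return ByteBuffer.allocate(8)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(value)
                .array();
    }

    public static void main(String[] args) {
        byte[] c = toCBytes(1L);
        // First byte holds the low-order bits in the C layout.
        System.out.println("first byte = " + c[0] + ", last byte = " + c[7]);
        // Round trip back to the Java value.
        System.out.println("decoded = " + fromCBytes(c));
        // Long.reverseBytes performs the same swap directly on a long.
        System.out.println("swapped = 0x" + Long.toHexString(Long.reverseBytes(1L)));
    }
}
```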

Now the strange thing is this:

If I give an input file with records having key1 = 3, value1 = 0, key2 = 2,
value2 = 0, key3 = 1, value3 = 0,
the output file gives perfectly C-compatible output in the octal dump -->

0000000 0001 0000 0000 0000 0000 0000 0000 0000
0000020 0000 0000 0000 0000 0000 0000 0000 0000
0000040 0000 0000 0000 0000 0002 0000 0000 0000
0000060 0000 0000 0000 0000 0000 0000 0000 0000
*
0000120 0003 0000 0000 0000 0000 0000 0000 0000
0000140 0000 0000 0000 0000 0000 0000 0000 0000
0000160 0000 0000 0000 0000


Now if I give an input file with key1 = 1, value1 = 0, key2 = 1, value2 = 0,
key3 = 1, value3 = 0, I get the sorted output file as -->

0000000 0001 0000 0000 0000 0000 0000 0000 0000
0000020 0000 0000 0000 0000 0000 0000 0000 0000
0000040 0000 0000 0000 0000 *0000 0000 0000 0100*
0000060 0000 0000 0000 0000 0000 0000 0000 0000
*
0000120 0001 0000 0000 0000 0000 0000 0000 0000
0000140 0000 0000 0000 0000 0000 0000 0000 0000
0000160 0000 0000 0000 0000

This is strange, since the second key should have been printed as 0001 0000
0000 0000! Notice that this happens only with the even-numbered repeated key.
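
To make the mismatch concrete, here is a small sketch that reads the key of
each 40-byte record as a little-endian long, the way a C reader on x86 would.
It assumes the dump above shows 16-bit hex words from a little-endian machine,
so the highlighted key is the byte sequence 00 00 00 00 00 00 00 01. The class
name and the in-memory sample are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class RecordKeyDump {
    static final int KEY_LEN = 8;
    static final int VALUE_LEN = 32;
    static final int RECORD_LEN = KEY_LEN + VALUE_LEN; // 40 bytes per record

    // Read the key of every record, interpreting it as a little-endian long.
    static long[] keysAsLittleEndian(byte[] file) {
        int n = file.length / RECORD_LEN;
        long[] keys = new long[n];
        ByteBuffer buf = ByteBuffer.wrap(file).order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < n; i++) {
            keys[i] = buf.getLong(i * RECORD_LEN); // absolute read at record start
        }
        return keys;
    }

    public static void main(String[] args) {
        // Rebuild the second output file: three records, all values zero.
        byte[] file = new byte[3 * RECORD_LEN];
        file[0] = 1;                  // record 1 key: 01 00 00 00 00 00 00 00
        file[RECORD_LEN + 7] = 1;     // record 2 key: 00 00 00 00 00 00 00 01
        file[2 * RECORD_LEN] = 1;     // record 3 key: 01 00 00 00 00 00 00 00
        for (long k : keysAsLittleEndian(file)) {
            System.out.println(k);
        }
    }
}
```

Under those assumptions the first and third keys decode as 1, while the second
decodes as 72057594037927936 (2^56), i.e. it was left in Java byte order.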

Please guide me on this!

Regards,
Matthew John
