hadoop-common-user mailing list archives

From Matthew John <tmatthewjohn1...@gmail.com>
Subject Re: Sort with customized input/output !!
Date Wed, 08 Sep 2010 15:02:39 GMT
Thanks for the reply, Ted!

As I understand it, a SequenceFile has a header followed by records in the
format: record length, key length, key, value, with a sync marker inserted
at regular intervals.
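
To make that layout concrete, here is a minimal Python sketch of the record
stream as described above. It is only an illustration of my understanding,
not Hadoop's actual code: the names pack_record, pack_sync and read_records
are hypothetical, the 16-byte marker value is made up, and I am assuming
that a record length of -1 flags a sync marker and that the record length
covers key plus value. The file header is left out entirely.

```python
import struct

SYNC_ESCAPE = -1          # assumption: a record length of -1 announces a sync marker
SYNC_MARKER = b"S" * 16   # assumption: made-up 16-byte marker, for illustration only


def pack_record(key: bytes, value: bytes) -> bytes:
    """Encode one record as: record length, key length, key, value."""
    record_len = len(key) + len(value)  # assumed to cover key + value
    return struct.pack(">ii", record_len, len(key)) + key + value


def pack_sync() -> bytes:
    """Encode a sync point: the escape value followed by the marker bytes."""
    return struct.pack(">i", SYNC_ESCAPE) + SYNC_MARKER


def read_records(buf: bytes):
    """Decode all (key, value) pairs from buf, skipping sync markers."""
    pos = 0
    records = []
    while pos < len(buf):
        (record_len,) = struct.unpack_from(">i", buf, pos)
        pos += 4
        if record_len == SYNC_ESCAPE:
            pos += len(SYNC_MARKER)  # skip the marker, carry on with the next record
            continue
        (key_len,) = struct.unpack_from(">i", buf, pos)
        pos += 4
        key = buf[pos:pos + key_len]
        value = buf[pos + key_len:pos + record_len]
        pos += record_len
        records.append((key, value))
    return records
```

In this sketch the marker exists only so that a reader can find a record
boundary again; it carries no data of its own.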

It would be great if someone could take a look at the following questions.

Q 1) My file has the format: header (a different one from the SequenceFile
header) followed by records (key, value). The record size and the key size
are both fixed. I would like to know whether I can modify the core code so
that the SequenceFile format matches this layout. If so, which code should
I look at?

Q 2) What is a sync marker, and can we define it ourselves? My file
obviously does not have one. Can someone suggest a way around this obstacle?
My final aim is to read this file in, sort it by key, and print the sorted
file.
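
To show what I am after, here is a small Python sketch of the end result on
a single machine, outside Hadoop. It is not a MapReduce job and it holds the
whole file in memory. The sizes (HEADER_SIZE, KEY_SIZE, VALUE_SIZE) and the
function name sort_fixed_records are hypothetical placeholders, and keys are
assumed to compare as raw bytes.

```python
import io

HEADER_SIZE = 16   # hypothetical size of the custom header, in bytes
KEY_SIZE = 4       # hypothetical fixed key size
VALUE_SIZE = 8     # hypothetical fixed value size
RECORD_SIZE = KEY_SIZE + VALUE_SIZE


def sort_fixed_records(src, dst) -> int:
    """Copy the header unchanged, then write the records sorted by key.

    src and dst are binary file-like objects. Returns the record count.
    """
    header = src.read(HEADER_SIZE)
    if len(header) != HEADER_SIZE:
        raise ValueError("file is shorter than the header")

    body = src.read()
    if len(body) % RECORD_SIZE != 0:
        raise ValueError("file ends with a truncated record")

    # Fixed-size records need no length fields or sync markers:
    # record i always starts at byte offset i * RECORD_SIZE of the body.
    records = [body[i:i + RECORD_SIZE] for i in range(0, len(body), RECORD_SIZE)]
    records.sort(key=lambda record: record[:KEY_SIZE])  # byte-wise key order

    dst.write(header)
    for record in records:
        dst.write(record)
    return len(records)


# Usage with in-memory buffers standing in for real files.
source = io.BytesIO(b"H" * HEADER_SIZE + b"bbbb" + b"2" * 8 + b"aaaa" + b"1" * 8)
target = io.BytesIO()
count = sort_fixed_records(source, target)
```

Because every record has the same size, any byte offset that is a multiple
of the record size (after the header) is a valid record boundary, which is
why this layout needs nothing like a sync marker to locate records.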

