hadoop-common-user mailing list archives

From "Periya.Data" <periya.d...@gmail.com>
Subject example of splitting a binary file
Date Thu, 15 Sep 2011 21:19:42 GMT
Hi all,
    Is there a good example that shows how to divide a large binary file into
splits? If there is one, please let me know. It would be a great place for
me to start.

Ideally, I want to create a custom InputFormat based on
SequenceFileAsBinaryInputFormat, plus a custom RecordReader that can correctly
read well-defined records (with known offsets) in my binary input file.

But for now, to begin, I want to learn the basics: read a binary file,
break it into splits of a known size, experiment with a record reader, and
get some output. I do not want to run any MapReduce on it yet. Once I know
how to do that, I can gradually build on it.

Please let me know if there are any links to such examples.
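
To make the question concrete, here is a minimal sketch of the basics
described above, written in plain Java with no Hadoop classes. Everything in
it is a hypothetical illustration: the class name BinarySplitDemo, the helper
methods, the 8-byte fixed record size, and the 20-byte split size are all
assumptions, not part of any Hadoop API. It assumes fixed-length records, and
it treats a record as belonging to the split in which it starts, so a reader
may read past the end of its split to finish the last record.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class BinarySplitDemo {

    /** A byte range of the file: [start, start + length). Hypothetical helper type. */
    static final class Split {
        final long start;
        final long length;

        Split(long start, long length) {
            this.start = start;
            this.length = length;
        }
    }

    /** Cuts a file of fileLength bytes into ranges of at most splitSize bytes. */
    static List<Split> computeSplits(long fileLength, long splitSize) {
        List<Split> splits = new ArrayList<>();
        for (long start = 0; start < fileLength; start += splitSize) {
            splits.add(new Split(start, Math.min(splitSize, fileLength - start)));
        }
        return splits;
    }

    /**
     * Reads every fixed-length record that STARTS inside the split.
     * Assumes records are laid out back to back from offset 0.
     */
    static List<byte[]> readRecords(Path file, Split split, int recordSize)
            throws IOException {
        List<byte[]> records = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long fileLength = raf.length();
            // First record boundary at or after the start of the split.
            long pos = ((split.start + recordSize - 1) / recordSize) * recordSize;
            long end = split.start + split.length;
            // Skip a trailing partial record at the end of the file.
            while (pos < end && pos + recordSize <= fileLength) {
                byte[] buf = new byte[recordSize];
                raf.seek(pos);
                raf.readFully(buf); // may read past the split end to finish a record
                records.add(buf);
                pos += recordSize;
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        int recordSize = 8;   // assumed record length in bytes
        long splitSize = 20;  // deliberately not a multiple of recordSize

        // Write 10 records of 8 bytes each to a temporary file.
        byte[] data = new byte[10 * recordSize];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i / recordSize); // every byte holds its record index
        }
        Path tmp = Files.createTempFile("records", ".bin");
        Files.write(tmp, data);

        for (Split s : computeSplits(Files.size(tmp), splitSize)) {
            List<byte[]> recs = readRecords(tmp, s, recordSize);
            System.out.println("split start=" + s.start + " length=" + s.length
                    + " records=" + recs.size());
        }
        Files.delete(tmp);
    }
}
```

With these sizes the 80-byte file yields four splits holding 3, 2, 3, and 2
records, so each of the 10 records is read exactly once even though the split
boundaries fall in the middle of records.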

