hadoop-common-user mailing list archives

From Harsh J <qwertyman...@gmail.com>
Subject Re: File formats in Hadoop
Date Sat, 19 Mar 2011 16:26:56 GMT

On Sat, Mar 19, 2011 at 9:31 PM, Weishung Chung <weishung@gmail.com> wrote:
> I am browsing through the hadoop.io package and was wondering what other
> file formats are available in hadoop other than SequenceFile and TFile?

Additionally, Hadoop has MapFiles and SetFiles (derivatives of
SequenceFiles, for when you need maps or sets) and IFiles (used by the
map-output buffers to produce a key-value file for the Reducers to
consume; internal use only).
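
To make the "derivative of SequenceFiles" point concrete: as I understand
it, a MapFile is a key-sorted data file plus a small index that records
only every Nth key, so a lookup searches the index and then scans one
short run of records. The sketch below illustrates that idea in plain
Python. It is not the Hadoop API; the names build, lookup and
INDEX_INTERVAL are hypothetical, and in-memory lists stand in for the
on-disk files.

```python
import bisect

# Hypothetical sampling interval, kept small so the example is easy to follow.
INDEX_INTERVAL = 4


def build(records, interval=INDEX_INTERVAL):
    """Return (data, index) for a dict of records.

    data  : list of (key, value) pairs sorted by key (stands in for the data file)
    index : list of (key, position) pairs for every interval-th record
            (stands in for the sparse index file)
    """
    data = sorted(records.items())
    index = [(data[i][0], i) for i in range(0, len(data), interval)]
    return data, index


def lookup(data, index, key, interval=INDEX_INTERVAL):
    """Binary-search the sparse index, then scan at most one block of data."""
    index_keys = [k for k, _ in index]
    slot = bisect.bisect_right(index_keys, key) - 1
    if slot < 0:
        return None  # key sorts before the first record
    start = index[slot][1]
    for k, v in data[start:start + interval]:
        if k == key:
            return v
        if k > key:
            break  # data is sorted, so the key cannot appear later
    return None


if __name__ == "__main__":
    data, index = build({"k%02d" % i: i for i in range(10)})
    print(lookup(data, index, "k05"))
    print(lookup(data, index, "missing"))
```

The trade-off this shows is that only the sparse index has to be held in
memory, at the cost of a short sequential scan per lookup. A set-style
file follows the same layout with keys only.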

Apache Hive uses RCFiles, which are very interesting too. Apache Avro
provides Avro data files, which are designed for use with Hadoop
MapReduce and Avro-serialized data.

I'm not sure about this one, but I believe Pig was implementing a
table-file-like solution of its own a while ago. Howl?

Harsh J
