hive-user mailing list archives

From Abhishek Girish
Subject Writing to an ORCFile using MapReduce + HCatalog APIs
Date Sun, 06 Apr 2014 23:17:36 GMT

I am working on custom Pig code that writes RDF data into text files.
Instead, I would like to *write to an ORCFile* to get some of the
columnar benefits it offers.

I understand that I need to use the *HCatalog APIs*. I have an idea of how
to create an HCatSchema for my data, and I understand that I would need to
use HCatOutputFormat to write into an ORCFile.

I need some help with *how to specify the storage format as ORCFile.* I see
that ORC has built-in support, but I cannot find any examples of how to
specify which output format the HCatalog APIs write to (default Hive
table, RCFile, ORCFile, SequenceFile, etc.).

I would then need to work on reading from these ORCFiles and reconstructing
the records.

Any pointers would be appreciated. Thanks in advance.

