orc-user mailing list archives

From "Piyush Mukati (Data Platform)" <piyush.muk...@flipkart.com>
Subject Converting json record to ORC.
Date Mon, 20 Feb 2017 07:14:07 GMT
We have a use case where our MapReduce job has to read both old JSON data
(each line is a JSON record with a fixed schema) and ORC files. The output
of the job will be ORC files.

I tried some approaches.

1) HCatalog, but it does not currently support reading from multiple tables.
The JSON data does not have Hive tables either.

2) The Hive ORC library and SerDe. However, I am unable to pass the
OrcStruct through the shuffle phase, because it does not implement
Writable. (I am creating the OrcStruct in the mapper.)

3) I am currently looking at the org.apache.orc.mapreduce APIs. Everything
looks good here, but I have to convert each existing JSON record to an
OrcStruct. This looks like a common use case. Writing a converter myself looks like

I am hoping someone in the community is aware of a utility that can help me
convert JSON to OrcStruct. Any other suggestions are welcome.
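
To make the question concrete, here is a minimal sketch of the kind of schema-driven conversion I have in mind. It uses only the JDK and deliberately leaves out the ORC classes. JsonRowConverter, ColumnType, parseSchema, convert and coerce are hypothetical names, not part of ORC or Hive. The sketch assumes each JSON line has already been parsed into a Map by some JSON library, that the schema is a flat struct written like "struct<id:bigint,name:string>", and that only four primitive types are needed. In a real job each converted value would presumably be wrapped in the matching Writable (LongWritable, DoubleWritable, BooleanWritable, Text) and placed into the OrcStruct at the same field position.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper for illustration; not part of ORC or Hive. */
class JsonRowConverter {

    /** The primitive column types this sketch handles (an assumption). */
    enum ColumnType { BIGINT, DOUBLE, BOOLEAN, STRING }

    /** Parses a flat schema string into ordered field-name/type pairs. */
    static Map<String, ColumnType> parseSchema(String schema) {
        String s = schema.trim();
        if (!s.startsWith("struct<") || !s.endsWith(">")) {
            throw new IllegalArgumentException("expected struct<...>: " + schema);
        }
        // LinkedHashMap keeps the fields in schema order.
        Map<String, ColumnType> fields = new LinkedHashMap<>();
        String body = s.substring("struct<".length(), s.length() - 1);
        for (String field : body.split(",")) {
            String[] parts = field.split(":");
            if (parts.length != 2) {
                throw new IllegalArgumentException("bad field: " + field);
            }
            String typeName = parts[1].trim().toUpperCase(Locale.ROOT);
            fields.put(parts[0].trim(), ColumnType.valueOf(typeName));
        }
        return fields;
    }

    /** Builds one row, in schema order, from an already-parsed JSON record. */
    static Object[] convert(Map<String, ?> record, Map<String, ColumnType> schema) {
        Object[] row = new Object[schema.size()];
        int i = 0;
        for (Map.Entry<String, ColumnType> field : schema.entrySet()) {
            row[i++] = coerce(record.get(field.getKey()), field.getValue());
        }
        return row;
    }

    /** Coerces one parsed JSON value to the Java type of its column. */
    static Object coerce(Object value, ColumnType type) {
        if (value == null) {
            return null; // missing key or JSON null becomes a null column
        }
        switch (type) {
            case BIGINT:
                return ((Number) value).longValue();
            case DOUBLE:
                return ((Number) value).doubleValue();
            case BOOLEAN:
                return (Boolean) value; // ClassCastException on a type mismatch
            default:
                return value.toString();
        }
    }

    public static void main(String[] args) {
        Map<String, ColumnType> schema =
            parseSchema("struct<id:bigint,name:string,score:double>");
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("name", "a");
        record.put("id", 7);
        // "score" is absent from the record, so its column is null.
        System.out.println(Arrays.toString(convert(record, schema)));
    }
}
```

Driving the conversion from the schema rather than from the JSON keys means that extra keys are ignored and missing ones become nulls, which seems like the right behaviour for old data with a fixed schema. Nested structs, lists and maps are not handled here.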

