hadoop-mapreduce-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: Mainframe to ASCII conversion
Date Mon, 11 Feb 2013 17:01:21 GMT

If the data is straight EBCDIC, it is somewhat splittable; however, it's really better
to do this in a single stream.
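For the straight-EBCDIC case, the character conversion itself is a one-line charset decode in the JDK. A minimal sketch, with the caveat that the code page (Cp037, the common US/Canada EBCDIC page) is an assumption and varies between shops:

```java
import java.nio.charset.Charset;

public class EbcdicToAscii {
    public static void main(String[] args) {
        // Hypothetical sample: the word "HELLO" encoded in EBCDIC code page 037.
        byte[] ebcdic = { (byte) 0xC8, (byte) 0xC5, (byte) 0xD3, (byte) 0xD3, (byte) 0xD6 };

        // Decoding with the EBCDIC charset yields a normal Java String,
        // which can then be re-encoded as ASCII/UTF-8 for output.
        String text = new String(ebcdic, Charset.forName("Cp037"));
        System.out.println(text); // HELLO
    }
}
```

In a mapper you would do the same decode per record; the hard part is record boundaries, not the byte mapping.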

If the data contains COMP-3 (zoned or packed decimal) fields, you will be unable to split the file into
pieces. You will also need to know the fixed-length format of the record.

From my personal experience, most of the data I have seen consisted of COMP-3 records, which required
knowing the underlying data structures.
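To make the COMP-3 point concrete: packed decimal stores two BCD digits per byte, with the final nibble holding the sign (0xC or 0xF positive, 0xD negative), so a plain charset decode cannot recover it; you have to unpack nibbles against the copybook layout. A minimal sketch of that unpacking (field lengths and values are made-up examples; libraries like JRecord handle this for you):

```java
public class PackedDecimal {
    // Decode an IBM COMP-3 (packed decimal) field into a long.
    // Each nibble is one decimal digit except the last, which is the sign.
    static long unpack(byte[] field) {
        long value = 0;
        for (int i = 0; i < field.length; i++) {
            int hi = (field[i] >> 4) & 0x0F;
            int lo = field[i] & 0x0F;
            value = value * 10 + hi;
            if (i < field.length - 1) {
                value = value * 10 + lo;
            } else if (lo == 0x0D) {
                // 0xD in the sign nibble means negative.
                value = -value;
            }
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] packed = { 0x12, 0x34, 0x5C }; // digits 1,2,3,4,5 with positive sign
        System.out.println(unpack(packed));   // 12345
    }
}
```

Note that the decimal point position is implied by the copybook (e.g. PIC S9(3)V99 COMP-3), not stored in the data, which is exactly why the record structures must be known up front.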

Of course YMMV

On Feb 8, 2013, at 9:23 PM, Jagat Singh <jagatsingh@gmail.com> wrote:

> Hi,
> I am thinking of writing a mapper to convert mainframe files to ASCII format, and contributing it back.
> Before I do anything, I wanted to confirm the following with you:
> Do we already have a MapReduce library doing the same work?
> Is there anything in Hadoop which makes this kind of conversion impossible, so that
> I don't end up spending time on something which cannot be done?
> I am not a mainframe guy, so I wanted to ask up front.
> Here is what is in my mind so far:
> The Oracle JDK supports the following encodings [1]; I plan to use existing
> libraries such as [2] or [3] to do the conversion.
> Thank you for your time and guidance.
> Regards,
> Jagat Singh
> 1) http://docs.oracle.com/javase/6/docs/technotes/guides/intl/encoding.doc.html
> 2) http://sourceforge.net/projects/jrecord/
> 3) http://sourceforge.net/projects/cb2java/

Michael Segel  | (m) 312.755.9623

Segel and Associates
