hadoop-mapreduce-user mailing list archives

From rahul p <rahulpoolancha...@gmail.com>
Subject Re: Handling files with unclear boundaries
Date Mon, 06 Aug 2012 15:45:08 GMT
Hi Tariq,
Can you accept my gtalk request?

On Mon, Aug 6, 2012 at 11:30 PM, Mohammad Tariq <dontariq@gmail.com> wrote:

> Hello list,
>      I need some guidance on how to handle files that don't have
> any proper delimiters or record boundaries. I am trying to
> process a set of files that are totally alien to me (SAS XPT files)
> through MR. The one thing that is always fixed is that each time I
> have to read 107 bytes from the line. Is it possible to use this
> length as a delimiter for creating splits somehow? And if so, which
> InputFormat would be appropriate? Many thanks.
> Regards,
>     Mohammad Tariq
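
On the fixed-length question quoted above: one way to think about it is
that split boundaries only need to land on multiples of the record length,
so that no 107-byte record straddles two splits. The sketch below is plain
Java and does not use any Hadoop classes. The class name FixedLengthSplits
and the methods alignedSplits and readRecords are hypothetical names made
up for illustration, and it assumes every record really is exactly 107
bytes with no header block in front of the data, which may not hold for
real XPT files.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: illustrates record-aligned splits.
public class FixedLengthSplits {
    // Record length taken from the question; assumed constant for the whole file.
    static final int RECORD_LENGTH = 107;

    /** Returns {offset, length} pairs whose start offsets are multiples of recordLength. */
    static List<long[]> alignedSplits(long fileLength, long targetSplitSize, int recordLength) {
        if (recordLength <= 0 || targetSplitSize < recordLength) {
            throw new IllegalArgumentException("split size must hold at least one record");
        }
        // Round the target split size down to a whole number of records.
        long splitSize = (targetSplitSize / recordLength) * recordLength;
        List<long[]> splits = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += splitSize) {
            splits.add(new long[] {offset, Math.min(splitSize, fileLength - offset)});
        }
        return splits;
    }

    /** Reads whole records from a stream; a trailing partial record is treated as an error. */
    static List<byte[]> readRecords(InputStream in, int recordLength) throws IOException {
        List<byte[]> records = new ArrayList<>();
        while (true) {
            byte[] buf = new byte[recordLength];
            int filled = 0;
            // read() may return fewer bytes than requested, so loop until the record is full.
            while (filled < recordLength) {
                int n = in.read(buf, filled, recordLength - filled);
                if (n < 0) {
                    break;
                }
                filled += n;
            }
            if (filled == 0) {
                return records; // clean end of input
            }
            if (filled < recordLength) {
                throw new IOException("partial record of " + filled + " bytes at end of input");
            }
            records.add(buf);
        }
    }

    public static void main(String[] args) throws IOException {
        // Example: a 10-record file with a 500-byte target split size.
        for (long[] s : alignedSplits(10L * RECORD_LENGTH, 500L, RECORD_LENGTH)) {
            System.out.println("offset=" + s[0] + " length=" + s[1]);
        }
        InputStream in = new ByteArrayInputStream(new byte[2 * RECORD_LENGTH]);
        System.out.println("records read: " + readRecords(in, RECORD_LENGTH).size());
    }
}
```

In an actual MR job the same arithmetic would live inside a custom
InputFormat and RecordReader. Newer Hadoop releases also include a
FixedLengthInputFormat in org.apache.hadoop.mapreduce.lib.input, which may
be worth checking if your version has it.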
