hive-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: Migrating Variable Length Files to Hive
Date Fri, 02 Jun 2017 16:41:44 GMT
On Fri, Jun 2, 2017 at 12:07 PM, Nishanth S <nishanth.2884@gmail.com> wrote:

> Hello hive users,
>
> We are looking at migrating files (less than 5 MB of data in total) with
> variable record lengths from a mainframe system to Hive. You could think of
> this as metadata. Each record can have from 3 to n columns (each record
> type has a different number of columns). What would be the best strategy
> for migrating this to Hive? I was thinking of converting these files into
> one variable-length CSV file and then importing it into a Hive table. The
> Hive table would consist of 4 columns, with the 4th column holding a
> comma-separated list of the values from columns 4 to n. Are there
> alternative or better approaches? I would appreciate any feedback on this.
>
> Thanks,
> Nishanth
>

Hive supports complex types such as arrays (lists), maps, and structs, and
they can be arbitrarily nested. If the nested data has a schema, that may be
your best option, potentially using the Thrift, Avro, Parquet, or Protobuf
support.
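
As a rough illustration of the complex-type route, here is a minimal Python
sketch that rewrites each variable-length record as one text line: the first
three fields stay as ordinary columns and the remaining fields are packed into
a fourth column that a Hive table could declare as array<string>. The function
name, the choice of "," and "|" as delimiters, and the assumption that exactly
three leading fields are fixed are mine, not something stated in the thread.

```python
# Sketch only: delimiters and the "3 fixed columns" split are assumptions.
FIELD_DELIM = ","   # would pair with FIELDS TERMINATED BY ',' in the table DDL
ITEM_DELIM = "|"    # would pair with COLLECTION ITEMS TERMINATED BY '|'


def to_hive_line(fields, fixed=3):
    """Join one record into a delimited line with a trailing array column."""
    if len(fields) < fixed:
        raise ValueError(f"expected at least {fixed} fields, got {len(fields)}")
    for value in fields:
        # Plain delimited text has no quoting here, so reject values that
        # would be split incorrectly when Hive reads the line back.
        if FIELD_DELIM in value or ITEM_DELIM in value or "\n" in value:
            raise ValueError(f"field contains a delimiter: {value!r}")
    head = fields[:fixed]          # the columns every record type has
    tail = fields[fixed:]          # columns 4..n, which vary by record type
    return FIELD_DELIM.join(head + [ITEM_DELIM.join(tail)])


if __name__ == "__main__":
    print(to_hive_line(["B", "2", "y", "p", "q"]))
```

Using a separate delimiter for the array items avoids the ambiguity of reusing
the comma inside the fourth column, which matters once the column is declared
as an array rather than a plain string.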

Otherwise, you can store the data as JSON and parse things out at read time
using JSON UDFs.
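
For the JSON route, a minimal Python sketch of the conversion step might look
like the following: each record becomes one JSON object per line, which could
then be loaded into a single string column and picked apart at read time with
a built-in such as get_json_object. The key names ("record_type", "key",
"name", "extra") are hypothetical placeholders, since the thread does not say
what the mainframe columns actually mean.

```python
import json

# Hypothetical names for the three columns every record is assumed to have.
FIXED_NAMES = ("record_type", "key", "name")


def to_json_line(fields, fixed_names=FIXED_NAMES):
    """Serialize one variable-length record as a single-line JSON object."""
    if len(fields) < len(fixed_names):
        raise ValueError(
            f"expected at least {len(fixed_names)} fields, got {len(fields)}"
        )
    record = dict(zip(fixed_names, fields))
    # Columns 4..n differ per record type, so keep them as an ordered list.
    record["extra"] = list(fields[len(fixed_names):])
    return json.dumps(record)


if __name__ == "__main__":
    for row in (["A", "1", "x"], ["B", "2", "y", "p", "q"]):
        print(to_json_line(row))
```

With data this small, one JSON document per line keeps the load step trivial
and lets JSON escaping handle commas or other awkward characters inside the
values.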

Edward
