hive-user mailing list archives

From Nishanth S <nishanth.2...@gmail.com>
Subject Re: Migrating Variable Length Files to Hive
Date Fri, 02 Jun 2017 17:01:10 GMT
Thanks Edward. I am leaning towards using an array. My nested data does not
have a schema. It is a collection of strings, and the number of strings can
vary.
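
A minimal sketch of the conversion step for the array approach, under some
assumptions that are not stated in this thread: the exported records are
already comma-separated text, the first three fields are common to every
record type, and neither ',' nor '|' occurs inside a value. The function name
and the '|' delimiter are illustrative only. Each record is collapsed to four
fields, with fields 4 to n joined by '|' so that a table declared with
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' COLLECTION ITEMS TERMINATED BY '|'
could read the last field as an ARRAY<STRING> column.

```python
FIELD_DELIMITER = ","        # assumption: the source records are comma-separated
COLLECTION_DELIMITER = "|"   # assumption: '|' never occurs inside a value
FIXED_COLUMNS = 3            # assumption: every record type shares the first 3 fields


def to_hive_row(record: str) -> str:
    """Collapse fields 4..n of one record into a single '|'-joined field."""
    fields = record.rstrip("\n").split(FIELD_DELIMITER)
    if len(fields) < FIXED_COLUMNS:
        raise ValueError(
            f"expected at least {FIXED_COLUMNS} fields, got {len(fields)}"
        )
    fixed, extras = fields[:FIXED_COLUMNS], fields[FIXED_COLUMNS:]
    # Refuse values that would be split incorrectly when Hive parses the array.
    if any(COLLECTION_DELIMITER in value for value in extras):
        raise ValueError("value contains the collection delimiter")
    return FIELD_DELIMITER.join(fixed + [COLLECTION_DELIMITER.join(extras)])


if __name__ == "__main__":
    for line in ["A,1,x,p,q,r", "B,2,y"]:
        print(to_hive_row(line))
```

A record with only the three fixed fields ends up with an empty fourth field,
so it is worth checking how that is read back before loading the real data.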



On Fri, Jun 2, 2017 at 10:41 AM, Edward Capriolo <edlinuxguru@gmail.com>
wrote:

>
>
> On Fri, Jun 2, 2017 at 12:07 PM, Nishanth S <nishanth.2884@gmail.com>
> wrote:
>
>> Hello Hive users,
>>
>> We are looking at migrating files (less than 5 MB of data in total) with
>> variable record lengths from a mainframe system to Hive. You could think of
>> this as metadata. Each record can have from 3 to n columns (that is, each
>> record type has a different number of columns), depending on its record
>> type. What would be the best strategy for migrating this to Hive? I was
>> thinking of converting these files into one variable-length CSV file and
>> then importing it into a Hive table. The Hive table would consist of 4
>> columns, with the 4th column holding a comma-separated list of the values
>> from columns 4 to n. Are there alternative or better approaches? I would
>> appreciate any feedback on this.
>>
>> Thanks,
>> Nishanth
>>
>
> Hive supports complex types such as ARRAY (list), MAP, and STRUCT, and they
> can be arbitrarily nested. If the nested data has a schema, that may be your
> best option, potentially using the Thrift/Avro/Parquet/Protobuf support.
>
> Otherwise you can store the data as JSON and parse things out at read time
> using JSON UDFs.
>
> Edward
>
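
For comparison, the JSON alternative suggested above could be sketched as
follows. This again assumes comma-separated input with three fixed leading
fields; the key names (record_type, col2, col3, extra_values) and the idea
that the first field is the record type are hypothetical, chosen only for
illustration. Each record becomes one JSON document per line, which could be
stored in a single string column and parsed at read time with a JSON UDF such
as get_json_object.

```python
import json

FIXED_COLUMNS = 3  # assumption: every record type shares the first 3 fields


def to_json_line(record: str) -> str:
    """Encode one comma-separated record as a single-line JSON document."""
    fields = record.rstrip("\n").split(",")
    if len(fields) < FIXED_COLUMNS:
        raise ValueError(
            f"expected at least {FIXED_COLUMNS} fields, got {len(fields)}"
        )
    document = {
        "record_type": fields[0],    # hypothetical key names
        "col2": fields[1],
        "col3": fields[2],
        "extra_values": fields[3:],  # variable-length tail, may be empty
    }
    return json.dumps(document)


if __name__ == "__main__":
    print(to_json_line("A,1,x,p,q"))
```

Compared with the delimited array layout, this avoids choosing a collection
delimiter, at the cost of parsing JSON on every read.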
