hadoop-hdfs-user mailing list archives

From Rex X <dnsr...@gmail.com>
Subject Re: What is the best way to locate the offset and length of all fields in a Hadoop sequential text file?
Date Fri, 22 Jan 2016 17:41:43 GMT
Hi LLoyd,

The columns of this table are STRINGs, BIGINTs, and TINYINTs, 2000
attributes in total.

I need to transform the data in ways that cannot be done with the
built-in functions of Hive.

Thank you.


On Fri, Jan 22, 2016 at 1:58 AM, Namikaze Minato <lloydsensei@gmail.com> wrote:

> Hello. We don't have any information about your data.
> I don't think we can help you with this. Also, I cannot understand what
> you are trying to achieve. Please also tell us why you are using hadoop
> streaming instead of hive to do your operations.
> Regards,
> LLoyd
> On 22 January 2016 at 06:30, Rex X <dnsring@gmail.com> wrote:
>> The given sequential files correspond to an external Hive table.
>> They are stored in
>> /tableName/part-00000
>> /tableName/part-00001
>> ...
>> There are about 2000 attributes in the table. Now I want to process the
>> data using Hadoop streaming and mapReduce. The first step is to find the
>> offset and length for each attribute.
>> What is the best way to get this information?
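For what it's worth, if the part files are plain text with Hive's default field delimiter (\x01, Ctrl-A) -- an assumption, since the table's SerDe and delimiter aren't stated in the thread -- a streaming mapper can compute each field's offset and length per record without any schema lookup. A minimal sketch:

```python
import sys

DELIM = "\x01"  # assumed: Hive's default field delimiter (Ctrl-A)

def field_offsets(line, delim=DELIM):
    """Return a list of (offset, length) pairs, one per field in the line.

    Offsets are character offsets from the start of the line; for ASCII
    data these equal byte offsets, but multi-byte characters would differ.
    """
    spans = []
    start = 0
    for field in line.split(delim):
        spans.append((start, len(field)))
        start += len(field) + len(delim)  # skip past the field and its delimiter
    return spans

if __name__ == "__main__":
    # Hadoop streaming feeds records on stdin, one per line.
    for raw in sys.stdin:
        line = raw.rstrip("\n")
        for i, (off, length) in enumerate(field_offsets(line)):
            sys.stdout.write("%d\t%d\t%d\n" % (i, off, length))
```

Run as the `-mapper` of a streaming job over /tableName/part-*, this emits (field index, offset, length) per record; a real transform would operate on the field values directly rather than emitting the spans.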
