hadoop-mapreduce-user mailing list archives

From Mapred Learn <mapred.le...@gmail.com>
Subject Re: Sequence File usage queries
Date Thu, 24 Feb 2011 00:24:03 GMT
Thanks !

In this case, how can we print the metadata associated with the data
(sequence files) if a user accessing the data wants to see it?
i) Is there a Hadoop command that can do it?
ii) Or will we have to provide some interface for the user to view the
metadata?

-JJ
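
For question ii), a minimal sketch of what such an interface could look like, assuming the files were written through the createWriter overload quoted below. The class name SequenceFileMetadataDemo, the metadata keys "layout.columns" and "layout.name", and the sample record are made up for illustration; only the SequenceFile, Metadata, and FileSystem calls come from Hadoop. Treat it as a starting point, not a tested tool.

```java
import java.io.IOException;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.SequenceFile.CompressionType;
import org.apache.hadoop.io.SequenceFile.Metadata;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.DefaultCodec;
import org.apache.hadoop.util.Progressable;

public class SequenceFileMetadataDemo {

  // Writes one sample record and stores the layout description in the file header.
  static void writeWithMetadata(FileSystem fs, Configuration conf, Path file)
      throws IOException {
    Metadata meta = new Metadata();
    // Hypothetical keys: any Text name/value pairs can be stored here.
    meta.set(new Text("layout.columns"), new Text("32"));
    meta.set(new Text("layout.name"), new Text("layout-a"));

    // No-op progress callback; a real uploader might report progress here.
    Progressable noop = new Progressable() {
      public void progress() { }
    };

    SequenceFile.Writer writer = SequenceFile.createWriter(
        fs, conf, file, Text.class, Text.class,
        conf.getInt("io.file.buffer.size", 4096),
        fs.getDefaultReplication(), fs.getDefaultBlockSize(),
        CompressionType.NONE, new DefaultCodec(), noop, meta);
    try {
      writer.append(new Text("row-1"), new Text("col1\tcol2\tcol3"));
    } finally {
      writer.close();
    }
  }

  // Opens the file and returns the Metadata stored in its header.
  static Metadata readMetadata(FileSystem fs, Configuration conf, Path file)
      throws IOException {
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, file, conf);
    try {
      return reader.getMetadata();
    } finally {
      reader.close();
    }
  }

  // Prints every metadata entry of the sequence file given as the first argument.
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    Path file = new Path(args[0]);
    FileSystem fs = file.getFileSystem(conf);
    Metadata meta = readMetadata(fs, conf, file);
    for (Map.Entry<Text, Text> e : meta.getMetadata().entrySet()) {
      System.out.println(e.getKey() + " = " + e.getValue());
    }
  }
}
```

Running main with the path of a sequence file as its only argument prints one "name = value" line per metadata entry. Only the header is read, so the cost should not depend on the size of the file.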

On Sat, Feb 19, 2011 at 9:17 AM, Ted Yu <yuzhihong@gmail.com> wrote:

> Option 2 is better.
> Please see this in SequenceFile:
>   public static Writer
>     createWriter(FileSystem fs, Configuration conf, Path name,
>                  Class keyClass, Class valClass, int bufferSize,
>                  short replication, long blockSize,
>                  CompressionType compressionType, CompressionCodec codec,
>                  Progressable progress, Metadata metadata)
>       throws IOException {
>
>
>
> On Thu, Feb 17, 2011 at 1:16 PM, Mapred Learn <mapred.learn@gmail.com> wrote:
>
>> Hi,
>> I have a use case to upload a few terabytes of text files as sequence
>> files on HDFS.
>>
>> These text files have several layouts ranging from 32 to 62 columns
>> (metadata).
>>
>> What would be a good way to upload these files along with their metadata:
>>
>> i) create a key/value class per text file layout and use it to create
>> and upload the sequence files?
>>
>> ii) create a SequenceFile.Metadata header in each file as it is uploaded
>> as a sequence file?
>>
>> Any inputs are appreciated !
>>
>> Thanks
>> -JJ
>>
>
>
