hadoop-common-user mailing list archives

From Joey Echeverria <j...@cloudera.com>
Subject Re: Writing small files to one big file in hdfs
Date Tue, 21 Feb 2012 17:23:50 GMT
I'd recommend making a SequenceFile[1] to store each XML file as a value.
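
To make the key/value idea concrete, here is a rough sketch that uses only the JDK. It is not the SequenceFile on-disk format and does not touch HDFS; it just packs many small XML documents into one container file as length-prefixed (name, contents) records and reads them back. The class name SmallFilePacker, the methods pack/unpack, and the choice of the file name as the key are all assumptions made for illustration. On a real cluster you would use Hadoop's own org.apache.hadoop.io.SequenceFile writer and reader instead of hand-rolling this.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: illustrates "one big file of key/value records".
public class SmallFilePacker {

    // Append every (name, xml) pair to the container as one record:
    // [int keyLength][key bytes][int valueLength][value bytes]
    static void pack(Map<String, String> xmlByName, Path container) throws IOException {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(container))) {
            for (Map.Entry<String, String> entry : xmlByName.entrySet()) {
                writeField(out, entry.getKey());
                writeField(out, entry.getValue());
            }
        }
    }

    // Read records sequentially until the end of the container is reached.
    static Map<String, String> unpack(Path container) throws IOException {
        Map<String, String> result = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(Files.newInputStream(container))) {
            while (true) {
                String key;
                try {
                    key = readField(in);
                } catch (EOFException endOfContainer) {
                    break; // no more records (a truncated length prefix also ends up here)
                }
                result.put(key, readField(in));
            }
        }
        return result;
    }

    private static void writeField(DataOutputStream out, String text) throws IOException {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    private static String readField(DataInputStream in) throws IOException {
        int length = in.readInt(); // throws EOFException at end of stream
        byte[] bytes = new byte[length];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("a.xml", "<doc id=\"1\"/>");
        files.put("b.xml", "<doc id=\"2\"/>");

        Path container = Files.createTempFile("packed-xml", ".bin");
        pack(files, container);
        System.out.println(unpack(container).keySet());
        Files.delete(container);
    }
}
```

The point of the layout is that each small file stays individually addressable by its key while all writes go sequentially into one large file, which is what lets you benefit from larger blocks and splits.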



On Tue, Feb 21, 2012 at 12:15 PM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:

> We have small XML files. Currently I am planning to append these small
> files to one file in HDFS so that I can take advantage of splits, larger
> blocks and sequential IO. What I am unsure of is whether it's OK to append
> one file at a time to this HDFS file.
> Could someone suggest if this is OK? Would like to know how others do it.

Joseph Echeverria
Cloudera, Inc.
