hadoop-mapreduce-user mailing list archives

From rab ra <rab...@gmail.com>
Subject Sequence files and merging
Date Sun, 24 Aug 2014 06:39:17 GMT

I need a few clarifications on the following questions about sequence files:

1. I have a bunch of sequence files. Each file has 8 keys and corresponding
values. Each value is a float array stored as bytes, and each key is a name
(a string). Storing and processing these small files individually is not
efficient, as there can be millions of them. Hence, I am thinking of creating
one sequence file out of this large number of files. Is that possible? I have
read that there are ways to merge sequence files. My question is: if I merge
a large number of sequence files, how can I retrieve the contents of each
individual small sequence file in my map processes?

2. When I merge, does the result become a different sequence file altogether,
with the keys merged? If so, the keys will be the same across all the files.
How will that be handled? Will there be any problem here?

3. Is it possible to append keys and values to an existing sequence file?
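
To make questions 1 and 2 concrete, here is a rough sketch of the kind of
merge I have in mind. It does not use the Hadoop API at all: plain Python
dicts and lists stand in for sequence files, and the file names, key names,
and both helper functions are made up purely for illustration. The assumption
is that each merged key carries the name of the file it came from, so that
identical key names from different files stay distinct and a mapper could
still tell which small file a record belonged to.

```python
import struct
from collections import defaultdict


def merge_small_files(small_files):
    """Flatten {file_name: [(key, value_bytes), ...]} into one record list.

    Each merged key is the pair (file_name, original_key), so the same key
    name appearing in many files does not collide after merging.
    """
    merged = []
    for file_name, records in small_files.items():
        for key, value in records:
            merged.append(((file_name, key), value))
    return merged


def split_by_source(merged):
    """Regroup merged records by the file they originally came from."""
    groups = defaultdict(list)
    for (file_name, key), value in merged:
        groups[file_name].append((key, value))
    return dict(groups)


# Two hypothetical small files that reuse the same key names.
# Values are float arrays packed as little-endian bytes.
small_files = {
    "small-0001": [
        ("sensor_a", struct.pack("<2f", 1.0, 2.0)),
        ("sensor_b", struct.pack("<2f", 3.0, 4.0)),
    ],
    "small-0002": [
        ("sensor_a", struct.pack("<2f", 5.0, 6.0)),
        ("sensor_b", struct.pack("<2f", 7.0, 8.0)),
    ],
}

merged = merge_small_files(small_files)
restored = split_by_source(merged)
```

Whether a composite key like this is the right approach for real sequence
files, or whether there is a better built-in way, is exactly what I am
unsure about.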

