hadoop-hdfs-user mailing list archives

From rab ra <rab...@gmail.com>
Subject Sequential files sizes
Date Tue, 02 Sep 2014 18:29:40 GMT

In one of my use cases, I generate a large number of sequence files. In each
of these files, I store a set of key/value pairs. The key is a string, and the
value is a list of float values. I know how many float values I am storing,
and based on that I estimate the size of each file to be around 700 KB.
However, when I check the size in HDFS, it is much smaller, around 20 KB.
I am not using any compression when writing the sequence files. Any clue here?
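
For reference, the kind of estimate described above can be sketched as below. This is only a rough illustration: the record count, key length, and floats per record are hypothetical placeholders rather than the actual figures, and the helper name estimatePayloadBytes is made up for this sketch. It counts raw payload only and ignores the sequence file header, sync markers, and per-record length fields.

```java
public class SequenceFileSizeEstimate {

    // Rough lower bound on the uncompressed payload:
    // key bytes plus 4 bytes per float, summed over all records.
    // Container overhead (header, sync markers, length fields) is not counted.
    static long estimatePayloadBytes(int records, int keyBytes, int floatsPerRecord) {
        long perRecord = (long) keyBytes + (long) floatsPerRecord * Float.BYTES;
        return records * perRecord;
    }

    public static void main(String[] args) {
        // Hypothetical numbers, chosen only to land near the 700 KB estimate.
        int records = 1000;
        int keyBytes = 16;
        int floatsPerRecord = 170;

        long bytes = estimatePayloadBytes(records, keyBytes, floatsPerRecord);
        System.out.println(bytes + " bytes (~" + (bytes / 1024) + " KB)");
    }
}
```

Without compression, the file on HDFS should not be smaller than this payload figure, so a much smaller reported size suggests it may be worth checking how many values actually reach the file and whether the writer was closed before the size was read.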

