hadoop-common-user mailing list archives

From Jim Twensky <jim.twen...@gmail.com>
Subject Re: How large is one file split?
Date Tue, 14 Apr 2009 19:15:59 GMT
Files are stored as blocks, and the default block size is 64 MB. You can
change this by setting the dfs.block.size property. Map/Reduce reads input
files in large chunks of bytes called splits. Splits are not physical;
think of them as logical records that describe the starting byte offset
within the file and the length of the split. Each mapper generally takes
one split and processes it. You can also configure the minimum split size
by setting the mapred.min.split.size property. Hope this helps.
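
For example, here is a minimal sketch of setting both properties per job
(this assumes the pre-0.20 JobConf API; the class name and the sizes used
are just illustrative):

    import org.apache.hadoop.mapred.JobConf;

    public class SplitSizeExample {
      public static void main(String[] args) {
        JobConf conf = new JobConf(SplitSizeExample.class);
        // HDFS block size for files this job writes, in bytes (128 MB here).
        conf.setLong("dfs.block.size", 128L * 1024 * 1024);
        // Lower bound on the split size handed to each mapper (32 MB here).
        conf.setLong("mapred.min.split.size", 32L * 1024 * 1024);
        // ...set input/output paths and mapper/reducer classes, then
        // submit with JobClient.runJob(conf).
      }
    }

You can also set the same properties cluster-wide in your Hadoop
configuration files instead of per job.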

-Jim

On Tue, Apr 14, 2009 at 1:05 PM, Foss User <fossist@gmail.com> wrote:

> I was reading in the documentation that files are stored as file
> splits in HDFS. What is the size of each file split? Is it
> configurable? If so, how can I configure it?
>
