hadoop-common-user mailing list archives

From imcaptor <imcap...@gmail.com>
Subject Re: how to improve the Hadoop's capability of dealing with small files
Date Thu, 07 May 2009 02:39:59 GMT
Please try -D dfs.block.size=4096000
The value must be specified in bytes, which is why 4M does not work.
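
To avoid working out the byte count by hand, the conversion can be done in the shell first. This is only a rough sketch: mib_to_bytes is a hypothetical helper, "file" and "/dest/" are the placeholders from the quoted question, and the command is printed rather than executed so that it can be checked before running it against a cluster.

```shell
# Hypothetical helper: convert a size in MiB to the byte count
# that dfs.block.size expects.
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

block_bytes=$(mib_to_bytes 4)   # 4 MiB expressed in bytes

# Print the command instead of running it; "file" and "/dest/" are
# placeholders taken from the original question.
echo "hadoop dfs -D dfs.block.size=${block_bytes} -put file /dest/"
```

Since -D only affects that one invocation, the HDFS-wide default block size should stay as it is for the other jobs.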

On Tue, May 5, 2009 at 4:47 AM, Christian Ulrik Søttrup <soettrup@nbi.dk> wrote:

> Hi all,
> I have a job that creates very big local files, so I need to split it across as
> many mappers as possible. With the DFS block size I'm
> using, this job is split into only 3 mappers. I don't want to
> change the HDFS-wide block size because it works for my other jobs.
> Is there a way to give a specific file a different block size? The
> documentation says there is, but does not explain how.
> I've tried:
> hadoop dfs -D dfs.block.size=4M -put file  /dest/
> But that does not work.
> Any help would be appreciated.
> Cheers,
> Chrulle

2009/5/7 陈桂芬 <chenguifen_hz@163.com>

> Hi:
> My application has many small files, but Hadoop is designed
> to deal with large files.
> I want to know why Hadoop doesn't handle small files very well and where
> the bottleneck is, and what I can do to improve Hadoop's capability of
> dealing with small files.
> Thanks.
