hadoop-user mailing list archives

From John Lilley <john.lil...@redpoint.net>
Subject RE: HDFS buffer sizes
Date Sat, 25 Jan 2014 14:09:37 GMT
There is this in FileSystem.java, which would appear to use the default buffer size of 4096
in the create() call unless a different value is specified via io.file.buffer.size:

  public FSDataOutputStream create(Path f, short replication,
      Progressable progress) throws IOException {
    return create(f, true,
                  getConf().getInt(
                      CommonConfigurationKeysPublic.IO_FILE_BUFFER_SIZE_KEY,
                      CommonConfigurationKeysPublic.IO_FILE_BUFFER_SIZE_DEFAULT),
                  replication,
                  getDefaultBlockSize(f), progress);
  }

But this discussion is missing the point. What I really want to know is: is there any benefit
to setting a larger bufferSize in FileSystem.create() and FileSystem.append()?
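For what it's worth, the client-side effect of that bufferSize argument can be illustrated with plain java.io (this is not Hadoop code, just a sketch of what any buffered stream wrapper does): a larger buffer coalesces many small application writes into fewer calls on the underlying stream, which is why it matters mainly for workloads doing lots of small writes rather than large sequential ones.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class BufferDemo {
    // Counts how many write() calls reach it; stands in for the
    // wrapped lower-level stream (e.g. the DFS output stream).
    static class CountingStream extends OutputStream {
        long calls = 0;
        @Override public void write(int b) { calls++; }
        @Override public void write(byte[] b, int off, int len) { calls++; }
    }

    // Write 10,000 small (100-byte) records through a buffer of the
    // given size and report how many underlying writes resulted.
    static long callsWithBuffer(int bufferSize) throws IOException {
        CountingStream sink = new CountingStream();
        try (BufferedOutputStream out = new BufferedOutputStream(sink, bufferSize)) {
            byte[] record = new byte[100];
            for (int i = 0; i < 10_000; i++) {
                out.write(record);
            }
        }
        return sink.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(callsWithBuffer(4096));   // 4 KB buffer: 250 underlying writes
        System.out.println(callsWithBuffer(65536));  // 64 KB buffer: 16 underlying writes
    }
}
```

With 100-byte records, a 4 KB buffer flushes every 40 records while a 64 KB buffer flushes every 655, so the larger buffer cuts the underlying call count by roughly 16x. Whether that translates into a measurable HDFS win depends on what the wrapped stream does per call, which is exactly the open question above.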

From: Arpit Agarwal [mailto:aagarwal@hortonworks.com]
Sent: Friday, January 24, 2014 9:35 AM
To: user@hadoop.apache.org
Subject: Re: HDFS buffer sizes

I don't think that value is used either except in the legacy block reader which is turned
off by default.

On Fri, Jan 24, 2014 at 6:34 AM, John Lilley <john.lilley@redpoint.net> wrote:
Ah, I see... it is a constant in CommonConfigurationKeysPublic.java:

  public static final int IO_FILE_BUFFER_SIZE_DEFAULT = 4096;

Are there benefits to increasing this for large reads or writes?
John

From: Arpit Agarwal [mailto:aagarwal@hortonworks.com]
Sent: Thursday, January 23, 2014 3:31 PM
To: user@hadoop.apache.org
Subject: Re: HDFS buffer sizes

HDFS does not appear to use dfs.stream-buffer-size.

On Thu, Jan 23, 2014 at 6:57 AM, John Lilley <john.lilley@redpoint.net> wrote:
What is the interaction between dfs.stream-buffer-size and dfs.client-write-packet-size?
I see that the default for dfs.stream-buffer-size is 4 KB.  Does anyone have experience using
larger buffers to optimize large writes?
Thanks
John



CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to which it is addressed
and may contain information that is confidential, privileged and exempt from disclosure under
applicable law. If the reader of this message is not the intended recipient, you are hereby
notified that any printing, copying, dissemination, distribution, disclosure or forwarding
of this communication is strictly prohibited. If you have received this communication in error,
please contact the sender immediately and delete it from your system. Thank You.

