From: Arpit Agarwal
To: user@hadoop.apache.org
Date: Mon, 27 Jan 2014 23:35:16 -0800
Subject: Re: HDFS buffer sizes

Looks like DistributedFileSystem ignores that buffer size, though.


On Sat, Jan 25, 2014 at 6:09 AM, John Lilley <john.lilley@redpoint.net> wrote:

There is this in FileSystem.java, which would appear to use the default buffer size of 4096 in the create() call unless otherwise specified in io.file.buffer.size:

  public FSDataOutputStream create(Path f, short replication,
      Progressable progress) throws IOException {
    return create(f, true,
                  getConf().getInt(
                      CommonConfigurationKeysPublic.IO_FILE_BUFFER_SIZE_KEY,
                      CommonConfigurationKeysPublic.IO_FILE_BUFFER_SIZE_DEFAULT),
                  replication,
                  getDefaultBlockSize(f), progress);
  }
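
The lookup in that overload is a plain "configured value, else default" pattern. A minimal sketch of it follows, assuming java.util.Properties as a stand-in for Hadoop's Configuration; the class BufferSizeLookup and the method resolveBufferSize are hypothetical names for illustration, and only the key io.file.buffer.size and the default 4096 come from the thread.

```java
import java.util.Properties;

// Stand-in for the lookup pattern in the quoted FileSystem.create() overload.
// Properties plays the role of Hadoop's Configuration here; this is NOT the Hadoop API.
final class BufferSizeLookup {
    // Key and default value as quoted in the thread.
    static final String IO_FILE_BUFFER_SIZE_KEY = "io.file.buffer.size";
    static final int IO_FILE_BUFFER_SIZE_DEFAULT = 4096;

    // Returns the configured value, or the default when the key is absent.
    static int resolveBufferSize(Properties conf) {
        String raw = conf.getProperty(IO_FILE_BUFFER_SIZE_KEY);
        return raw == null ? IO_FILE_BUFFER_SIZE_DEFAULT : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(resolveBufferSize(conf)); // key absent, default applies

        conf.setProperty(IO_FILE_BUFFER_SIZE_KEY, "65536");
        System.out.println(resolveBufferSize(conf)); // explicit setting wins
    }
}
```

This only shows which value ends up as the bufferSize argument; whether the file system implementation behind create() actually uses that argument is a separate question.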

But this discussion is missing the point; I really want to know, is there any benefit to setting a larger bufferSize in FileSystem.create() and FileSystem.append()?

From: Arpit Agarwal [mailto:aagarwal@hortonworks.com]
Sent: Friday, January 24, 2014 9:35 AM
To: user@hadoop.apache.org
Subject: Re: HDFS buffer sizes

I don't think that value is used either, except in the legacy block reader, which is turned off by default.

On Fri, Jan 24, 2014 at 6:34 AM, John Lilley <john.lilley@redpoint.net> wrote:

Ah, I see… it is a constant

CommonConfigurationKeysPublic.java:  public static final int IO_FILE_BUFFER_SIZE_DEFAULT = 4096;

Are there benefits to increasing this for large reads or writes?

john
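
For background on what a stream buffer size buys in general, here is a rough sketch using plain java.io only. It is not HDFS code and says nothing about whether an HDFS client honors the setting; BufferEffectSketch, CountingSink and downstreamWrites are made-up names for illustration. It counts how many write calls reach the underlying sink when the same data goes through buffers of different sizes.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Generic java.io illustration of buffer size versus downstream write calls.
// Assumption: this models a local buffered stream, not the HDFS write path.
final class BufferEffectSketch {
    // Sink that discards data but counts how many write calls reach it.
    static final class CountingSink extends OutputStream {
        long calls = 0;

        @Override
        public void write(int b) {
            calls++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
        }
    }

    // Writes totalBytes one byte at a time through a buffer of the given size
    // and returns the number of write calls that reached the sink.
    static long downstreamWrites(int bufferSize, int totalBytes) throws IOException {
        CountingSink sink = new CountingSink();
        try (OutputStream out = new BufferedOutputStream(sink, bufferSize)) {
            for (int i = 0; i < totalBytes; i++) {
                out.write(i);
            }
        } // close() flushes whatever is still buffered
        return sink.calls;
    }

    public static void main(String[] args) throws IOException {
        int total = 1 << 20; // 1 MiB
        System.out.println("4096:  " + downstreamWrites(4096, total));
        System.out.println("65536: " + downstreamWrites(65536, total));
    }
}
```

With 1 MiB written a byte at a time, the 4096-byte buffer should hand data downstream 256 times and the 65536-byte buffer 16 times. A larger buffer only pays off where each downstream call is expensive and where the layer underneath really uses the size it is given.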

From: Arpit Agarwal [mailto:aagarwal@hortonworks.com]
Sent: Thursday, January 23, 2014 3:31 PM
To: user@hadoop.apache.org
Subject: Re: HDFS buffer sizes

HDFS does not appear to use dfs.stream-buffer-size.

On Thu, Jan 23, 2014 at 6:57 AM, John Lilley <john.lilley@redpoint.net> wrote:

What is the interaction between dfs.stream-buffer-size and dfs.client-write-packet-size?

I see that the default for dfs.stream-buffer-size is 4K.  Does anyone have experience using larger buffers to optimize large writes?

Thanks

John


CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
