From: Stanley Shi <sshi@pivotal.io>
To: user@hadoop.apache.org
Date: Thu, 28 Aug 2014 14:18:16 +0800
Subject: Re: Appending to HDFS file

You should not use this method:

    FSDataOutputStream fp = fs.create(pt, true)

Here is the Javadoc for this "create" method:

    /**
     * Create an FSDataOutputStream at the indicated Path.
     * @param f the file to create
     * @param overwrite if a file with this name already exists, then if true,
     *   the file will be overwritten, and if false an exception will be thrown.
     */
    public FSDataOutputStream create(Path f, boolean overwrite)
        throws IOException {
      return create(f, overwrite,
          getConf().getInt("io.file.buffer.size", 4096),
          getDefaultReplication(f),
          getDefaultBlockSize(f));
    }
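A minimal sketch of doing the append with FileSystem.append() instead (the class name, path, and record written below are placeholders; it assumes the file system supports append, which HDFS 2.4.1 does with the default dfs.support.append=true):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsAppendSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path pt = new Path("/tmp/output.txt"); // placeholder path

        // Create the file once with overwrite=false, then reopen it for
        // append on later writes, instead of calling create(pt, true),
        // which replaces the existing file on every call.
        FSDataOutputStream out = fs.exists(pt) ? fs.append(pt) : fs.create(pt, false);
        out.writeBytes("key value\n"); // placeholder record
        out.close();
        fs.close();
    }
}

FileSystem.append() is only usable on file systems that implement it (HDFS does); calling it repeatedly in a loop adds to the end of the existing file rather than overwriting it.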
On Wed, Aug 27, 2014 at 2:12 PM, rab ra wrote:

> hello
>
> Here is the code snippet I use to append:
>
> def outFile = "${outputFile}.txt"
> Path pt = new Path("${hdfsName}/${dir}/${outFile}")
> def fs = org.apache.hadoop.fs.FileSystem.get(configuration);
> FSDataOutputStream fp = fs.create(pt, true)
> fp << "${key} ${value}\n"
>
> On 27 Aug 2014 09:46, "Stanley Shi" wrote:
>
>> Would you please paste the code in the loop?
>>
>> On Sat, Aug 23, 2014 at 2:47 PM, rab ra wrote:
>>
>>> Hi
>>>
>>> By default, it is true in Hadoop 2.4.1. Nevertheless, I have set it to
>>> true explicitly in hdfs-site.xml. Still, I am not able to achieve append.
>>>
>>> Regards
>>> On 23 Aug 2014 11:20, "Jagat Singh" wrote:
>>>
>>>> What is the value of dfs.support.append in hdfs-site.xml?
>>>>
>>>> https://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
>>>>
>>>> On Sat, Aug 23, 2014 at 1:41 AM, rab ra wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I am currently using Hadoop 2.4.1. I am running an MR job using the
>>>>> hadoop streaming utility.
>>>>>
>>>>> The executable needs to write a large amount of information to a file.
>>>>> However, this write is not done in a single attempt; the file needs to
>>>>> be appended with streams of information as they are generated.
>>>>>
>>>>> In the code, inside a loop, I open a file in HDFS and append some
>>>>> information. This is not working, and I see only the last write.
>>>>>
>>>>> How do I accomplish an append operation in Hadoop? Can anyone share a
>>>>> pointer with me?
>>>>>
>>>>> regards
>>>>> Bala
>>>>
>>
>> --
>> Regards,
>> Stanley Shi
>>

--
Regards,
Stanley Shi