Subject: Re: multiple map tasks writing in same hdfs file -issue
From: Arpit Agarwal
To: user@hadoop.apache.org
Date: Thu, 10 Jul 2014 11:00:17 -0700

HDFS is single-writer, multiple-reader (see sec 8.3.1 of
http://aosabook.org/en/hdfs.html). You cannot have multiple writers for a
single file at a time.

On Thu, Jul 10, 2014 at 2:55 AM, rab ra wrote:
> Hello
>
> I have a use case that spans multiple map tasks in a Hadoop environment. I
> use Hadoop 1.2.1 with 6 task nodes. Each map task writes its output
> to a file stored in HDFS. This file is shared across all the map tasks.
> They all compute their output, but some of the results are missing from
> the output file.
>
> The output file is an Excel file with 8 parameters (headings). Each map
> task is supposed to compute all 8 values and save each one as soon as it
> is computed. This means the map task's logic opens the file, writes the
> value, and closes it, 8 times.
>
> Can someone give me a hint about what is going wrong here?
>
> Is it possible to have more than one map task write to a shared file in
> HDFS?
>
> Regards
> Rab
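Since HDFS allows only one writer per file, a common way around the problem described above is to have every task write to its own file and let a single process combine the results once all tasks have finished. The sketch below illustrates that pattern on the local filesystem with plain java.nio; it is not Hadoop code. The class name PerTaskOutput, the methods writePart and mergeParts, and the part-NNNNN naming are assumptions made for illustration only.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Local-filesystem analogue of "one output file per task, merge afterwards".
// All names here are hypothetical; none of them are Hadoop APIs.
class PerTaskOutput {

    // Each task writes only to its own file, so every file has exactly one writer.
    static Path writePart(Path dir, int taskId, List<String> rows) throws IOException {
        Path part = dir.resolve(String.format("part-%05d", taskId));
        return Files.write(part, rows, StandardCharsets.UTF_8);
    }

    // A single process merges the part files, in file-name order, after all tasks are done.
    static void mergeParts(Path dir, Path merged) throws IOException {
        List<Path> parts;
        try (Stream<Path> files = Files.list(dir)) {
            parts = files
                .filter(p -> p.getFileName().toString().startsWith("part-"))
                .sorted()
                .collect(Collectors.toList());
        }
        List<String> all = new ArrayList<>();
        for (Path part : parts) {
            all.addAll(Files.readAllLines(part, StandardCharsets.UTF_8));
        }
        Files.write(merged, all, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("job-output");
        // Two simulated tasks, each producing one row with 8 values.
        writePart(dir, 0, List.of("task0,1,2,3,4,5,6,7,8"));
        writePart(dir, 1, List.of("task1,8,7,6,5,4,3,2,1"));
        Path merged = dir.resolve("merged.csv");
        mergeParts(dir, merged);
        Files.readAllLines(merged, StandardCharsets.UTF_8).forEach(System.out::println);
    }
}
```

In an actual MapReduce job the framework's output format normally gives each task its own part file in the job output directory, so the per-task step usually needs no custom file handling; the merge can then be done by a single reducer or after the job with a tool such as hadoop fs -getmerge.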