Subject: Re: HDFS "file" missing a part-file
From: Robert Molina
To: user@hadoop.apache.org
Date: Mon, 1 Oct 2012 13:01:41 -0700

It seems that the previous Pig script may not have generated the output data or written it correctly to HDFS. Can you provide the Pig script you are trying to run? Also, for the original script that ran and generated the file, can you verify whether that job had any failed tasks?
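One quick thing to check (a sketch; /user/you/output below is a placeholder, not a path from your mail): from the grunt shell, list the job's output directory and the size of each part file. A cleanly committed output has a _SUCCESS marker and no leftover _temporary directory, and a 0-byte part-r-* file usually means that reducer task never wrote anything.

    -- Run from the grunt shell; the path is a placeholder.
    -- Expect a _SUCCESS marker and no _temporary subdirectory.
    fs -ls /user/you/output
    -- Per-file sizes: a 0-byte part-r-* file points at a reducer
    -- that finished without writing any output.
    fs -du /user/you/output

If _temporary is still present, the job's output was never committed, and re-running the producing job is safer than reading the directory as-is.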
On Mon, Oct 1, 2012 at 10:31 AM, Björn-Elmar Macek <ema@cs.uni-kassel.de> wrote:
>
> Hi Robert,
>
> the exception I see in both the grunt shell output and the Pig log is:
>
>
> Backend error message
> ---------------------
> java.util.EmptyStackException
>         at java.util.Stack.peek(Stack.java:102)
>         at org.apache.pig.builtin.Utf8StorageConverter.consumeTuple(Utf8StorageConverter.java:182)
>         at org.apache.pig.builtin.Utf8StorageConverter.bytesToTuple(Utf8StorageConverter.java:501)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:905)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:334)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:233)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:271)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:266)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>
>
> On Mon, 1 Oct 2012 10:12:22 -0700, Robert Molina <rmolina@hortonworks.com> wrote:
>
>> Hi Björn,
>> Can you post the exception you are getting during the map phase?
>>
>> On Mon, Oct 1, 2012 at 9:11 AM, Björn-Elmar Macek wrote:
>>
>> Hi,
>>
>> I am kind of unsure where to post this problem, but I think it is
>> more related to Hadoop than to Pig.
>>
>> By successfully executing a Pig script I created a new file in my
>> HDFS. Sadly though, I cannot use it for further processing except for
>> "dump"ing and viewing the data: every data-manipulation command,
>> such as "foreach", gives exceptions during the map phase.
>> Since there was no problem executing the same script on the first 100
>> lines of my data (LIMIT statement), I copied it to my local fs folder.
>> What I realized is that one of the files, namely part-r-000001, was
>> empty and contained within the _temporary folder.
>>
>> Is there any reason for this? How can I fix this issue? Did the job
>> (which created the file we are talking about) NOT run properly til its
>> end, although the tasktracker worked til the very end and the file was
>> created?
>>
>> Best regards,
>> Björn
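A note on the stack trace above (a hedged reading, not a confirmed diagnosis): Utf8StorageConverter.consumeTuple is the path Pig takes when casting raw text to a tuple field, and it can fail with EmptyStackException when a line is not in "(v1,v2)" tuple syntax, for example an empty line coming from an empty part file. A minimal Pig sketch of that pattern follows; the path and schema are hypothetical, not taken from the thread:

    -- Hypothetical sketch: path and schema are placeholders.
    -- Declaring a tuple field forces Pig to parse each text value as
    -- "(v1,v2)"; an empty or malformed line can then surface as
    -- java.util.EmptyStackException in Utf8StorageConverter.consumeTuple.
    raw = LOAD '/user/you/output' AS (t:tuple(a:int, b:chararray));
    projected = FOREACH raw GENERATE t.a, t.b;
    DUMP projected;

If the empty part-r-000001 under _temporary is among the files the follow-up script reads, that alone could explain why DUMP works (no cast is forced) while FOREACH fails during the map phase.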