From: Björn-Elmar Macek
Organization: Universität Kassel - Fachgebiet Wissensverarbeitung
Date: Mon, 01 Oct 2012 19:31:40 +0200
Subject: Re: HDFS "file" missing a part-file
Reply-To: user@hadoop.apache.org
References: <5069C0B0.2090201@cs.uni-kassel.de>
Message-ID: <03503a442b790fbb48303b88f2f05855@cs.uni-kassel.de>

Hi Robert,

the exception I see in the output of the grunt shell, and likewise in the Pig log, is:

Backend error message
---------------------
java.util.EmptyStackException
    at java.util.Stack.peek(Stack.java:102)
    at org.apache.pig.builtin.Utf8StorageConverter.consumeTuple(Utf8StorageConverter.java:182)
    at org.apache.pig.builtin.Utf8StorageConverter.bytesToTuple(Utf8StorageConverter.java:501)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:905)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:334)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:233)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:271)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:266)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)

On Mon, 1 Oct 2012 10:12:22 -0700, Robert Molina wrote:
> Hi Bjorn,
> Can you post the exception you are getting during the map phase?
>
> On Mon, Oct 1, 2012 at 9:11 AM, Björn-Elmar Macek wrote:
> Hi,
>
> I am not sure where to post this problem, but I think it is more
> related to Hadoop than to Pig.
>
> By successfully executing a Pig script I created a new file in my
> HDFS. Sadly, I cannot use it for further processing except for
> "dump"ing and viewing the data: every data-manipulation script
> command, such as "foreach", throws exceptions during the map phase.
> Since there was no problem executing the same script on the first
> 100 lines of my data (LIMIT statement), I copied the file to a
> folder on my local filesystem. What I noticed is that one of the
> files, namely part-r-000001, was empty and located inside the
> _temporary folder.
>
> Is there any reason for this? How can I fix this issue? Did the job
> that created this file NOT run properly to its end, even though the
> tasktracker worked until the very end and the file was created?
>
> Best regards,
> Björn