Date: Wed, 8 Dec 2010 16:53:30 -0500
From: Kris Coward
To: user@pig.apache.org
Subject: IOException appearing during dump but not illustrate

Hi,

I've recently gotten stumped by a problem where my attempts to dump the
relations produced by a GROUP command give the following error (though
illustrating the same relation works fine):

    java.io.IOException: Type mismatch in key from map: expected org.apache.pig.impl.io.NullableBytesWritable, recieved org.apache.pig.impl.io.NullableText
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:807)
        at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:108)
        .
        .
        .

For a little background, the relation that's failing is called y5, and is
produced by the following string of commands (in grunt):

    y2 = foreach y1 generate $0 as timestamp, myudfs.httpArgParse($1) as argMap;
    y3 = foreach y2 generate argMap#'s' as uid, timestamp as timestamp;
    y4 = FILTER y3 BY (uid is not null);
    y5 = GROUP y4 BY uid;

and to get an idea what sort of data is involved, ILLUSTRATE y4 yields:

-------------------------------------------------------------------------------------------------
| y1 | timestamp: int | args: bag({tuple_of_tokens: (token: chararray)})                        |
-------------------------------------------------------------------------------------------------
|    | 1265950806     | {(s=1381688313), (u=F68FFA1F655FDF494ABA520D95E1D99E), (ts=1265950805)} |
-------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------
| y2 | timestamp: int | argMap: map                                                       |
-------------------------------------------------------------------------------------------
|    | 1265950806     | {u=F68FFA1F655FDF494ABA520D95E1D99E, ts=1265950805, s=1381688313} |
-------------------------------------------------------------------------------------------
----------------------------------------
| y3 | uid: bytearray | timestamp: int |
----------------------------------------
|    | 1381688313     | 1265950806     |
----------------------------------------
----------------------------------------
| y4 | uid: bytearray | timestamp: int |
----------------------------------------
|    | 1381688313     | 1265950806     |
----------------------------------------

The same problem was also produced when the FILTER command was omitted, and
the relevant chunk of
code in myudfs.httpArgParse is:

    StringTokenizer tok = new StringTokenizer((String)pair, "=", false);
    if (tok.hasMoreTokens() ) {
        String oKey = tok.nextToken();
        if (tok.hasMoreTokens() ) {
            Object oValue = tok.nextToken();
            output.put(oKey, oValue);
        } else {
            output.put(oKey, null);
        }
    }

If anyone has any insight into how I could get this to work, that'd really
help me out.

Thanks,
Kris

P.S. For those who remember my earlier post about getting httpArgParse to
compile, I took the advice to ditch the InternalMap in favour of a HashMap.

-- 
Kris Coward                                     http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3
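
For anyone wanting to poke at the parsing step outside of Pig, here is a minimal, self-contained sketch of the fragment quoted above. The class name ArgParseSketch, the parsePairs helper and the "flag" token are hypothetical, made up for illustration; the real UDF wraps this logic in Pig's EvalFunc machinery, which is not shown here. The sketch only demonstrates what ends up in the HashMap: every value is a plain java.lang.String, and a key without a value maps to null.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

public class ArgParseSketch {
    // Hypothetical helper: splits each "key=value" token on "=" and stores
    // the pair in a HashMap, mirroring the fragment quoted in the message.
    static Map<String, Object> parsePairs(String[] pairs) {
        Map<String, Object> output = new HashMap<String, Object>();
        for (String pair : pairs) {
            StringTokenizer tok = new StringTokenizer(pair, "=", false);
            if (tok.hasMoreTokens()) {
                String oKey = tok.nextToken();
                // A key with no "=value" part maps to null.
                Object oValue = tok.hasMoreTokens() ? tok.nextToken() : null;
                output.put(oKey, oValue);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        Map<String, Object> m = parsePairs(new String[] {
            "s=1381688313",
            "u=F68FFA1F655FDF494ABA520D95E1D99E",
            "ts=1265950805",
            "flag"
        });
        System.out.println(m.get("s"));                       // 1381688313
        System.out.println(m.get("s").getClass().getName());  // java.lang.String
        System.out.println(m.containsKey("flag") + " " + m.get("flag")); // true null
    }
}
```

Note that the stored values are Strings, while ILLUSTRATE reports uid as bytearray. It is possible, though not confirmed anywhere in this message, that this difference is related to the NullableBytesWritable/NullableText mismatch; an explicit cast such as (chararray)argMap#'s' in the y3 statement would be one thing to try.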