From: Dmitriy Ryaboy
To: "dev@pig.apache.org"
Cc: "dev@pig.apache.org"
Subject: Re: Wilbur, AvroStorage
Date: Tue, 6 Dec 2011 23:33:10 -0800
List-Id: dev@pig.apache.org

Yea please post to pig Jira, preferably with an example of how to reproduce the error (better yet a test that demonstrates the fix)

On Dec 6, 2011, at 11:25 PM, Russell Jurney wrote:

> I fixed the bug, in AvroStorageUtils.java:
>
>   /** check whether it is just a wrapped tuple */
>   public static boolean isTupleWrapper(ResourceFieldSchema pigSchema) {
>     System.err.println("is a wrapped tuple!");
>     Boolean status = false;
>     if(pigSchema.getType() == DataType.TUPLE)
>       if(pigSchema.getName() != null)
>         if(pigSchema.getName().equals(AvroStorageUtils.PIG_TUPLE_WRAPPER))
>           status = true;
>     return status;
>   }
>
> The script now works. Will make a patch. Should I make a ticket?
>
> On Tue, Dec 6, 2011 at 5:36 PM, Dmitriy Ryaboy wrote:
>
>> If you send a pull to wilbur, I can merge it. But we are also still
>> supporting piggybank as wilbur never really got off the ground...
>>
>> D
>>
>> On Tue, Dec 6, 2011 at 3:47 PM, Russell Jurney wrote:
>>> I'm debugging the AvroStorage UDF in piggybank for this blog post:
>>>
>>> http://datasyndrome.com/post/13707537045/booting-the-analytics-application-events-ruby
>>>
>>> The script is:
>>>
>>> messages = LOAD '/tmp/messages.avro' USING AvroStorage();
>>> user_groups = GROUP messages by user_id;
>>> per_user = FOREACH user_groups {
>>>   sorted = ORDER messages BY message_id DESC;
>>>   GENERATE group AS user_id, sorted AS messages;
>>> }
>>> DESCRIBE per_user
>>>> per_user: {user_id: int,messages: {(message_id: int,topic: chararray,user_id: int)}}
>>> STORE per_user INTO '/tmp/per_user.avro' USING AvroStorage();
>>>
>>> The error is:
>>>
>>> Pig Stack Trace
>>> ---------------
>>> ERROR 1002: Unable to store alias per_user
>>>
>>> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias per_user
>>>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1596)
>>>   at org.apache.pig.PigServer.registerQuery(PigServer.java:584)
>>>   at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:942)
>>>   at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
>>>   at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
>>>   at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
>>>   at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:67)
>>>   at org.apache.pig.Main.run(Main.java:487)
>>>   at org.apache.pig.Main.main(Main.java:108)
>>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>   at java.lang.reflect.Method.invoke(Method.java:597)
>>>   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>> Caused by: java.lang.NullPointerException
>>>   at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.isTupleWrapper(AvroStorageUtils.java:327)
>>>   at org.apache.pig.piggybank.storage.avro.PigSchema2Avro.convert(PigSchema2Avro.java:82)
>>>   at org.apache.pig.piggybank.storage.avro.PigSchema2Avro.convert(PigSchema2Avro.java:105)
>>>   at org.apache.pig.piggybank.storage.avro.PigSchema2Avro.convertRecord(PigSchema2Avro.java:151)
>>>   at org.apache.pig.piggybank.storage.avro.PigSchema2Avro.convert(PigSchema2Avro.java:62)
>>>   at org.apache.pig.piggybank.storage.avro.AvroStorage.checkSchema(AvroStorage.java:502)
>>>   at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:65)
>>>   at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:77)
>>>   at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
>>>   at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
>>>   at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
>>>   at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
>>>   at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
>>>   at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
>>>   at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
>>>   at org.apache.pig.newplan.logical.rules.InputOutputFileValidator.validate(InputOutputFileValidator.java:45)
>>>   at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:292)
>>>   at org.apache.pig.PigServer.compilePp(PigServer.java:1360)
>>>   at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1297)
>>>   at org.apache.pig.PigServer.execute(PigServer.java:1286)
>>>   at org.apache.pig.PigServer.access$400(PigServer.java:125)
>>>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1591)
>>>   ... 13 more
>>>
>>> I need to fix this. Which means I need to commit a patch to get in the
>>> current piggybank? I've got some time... is it worthwhile to resurrect
>>> wilbur on github and move piggybank over?
>>>
>>> --
>>> Russell Jurney
>>> twitter.com/rjurney
>>> russell.jurney@gmail.com
>>> datasyndrome.com
>>
>
> --
> Russell Jurney
> twitter.com/rjurney
> russell.jurney@gmail.com
> datasyndrome.com
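
For reference, the null check discussed in this thread can be sketched as standalone Java, together with the kind of small test asked for above. This is only a sketch, not the piggybank code: FieldSchema, the TUPLE and INTEGER type codes, and the PIG_TUPLE_WRAPPER value are hypothetical stand-ins for Pig's ResourceFieldSchema, DataType.TUPLE, and AvroStorageUtils.PIG_TUPLE_WRAPPER, so that it compiles without Pig on the classpath. It assumes the NullPointerException came from calling equals() on a null field name, which is what the fix in the thread suggests.

```java
// Standalone sketch of a null-safe "is this a wrapped tuple?" check.
// FieldSchema, TUPLE, INTEGER and PIG_TUPLE_WRAPPER are hypothetical
// stand-ins; they are NOT Pig's real classes or constant values.
class TupleWrapperCheck {
    static final byte TUPLE = 1;    // placeholder type code (assumption)
    static final byte INTEGER = 2;  // placeholder type code (assumption)
    static final String PIG_TUPLE_WRAPPER = "WRAPPER_NAME"; // placeholder value

    /** Minimal stand-in for a field schema whose name may be null. */
    static final class FieldSchema {
        private final byte type;
        private final String name;

        FieldSchema(byte type, String name) {
            this.type = type;
            this.name = name;
        }

        byte getType() { return type; }
        String getName() { return name; }
    }

    /** Returns true only for a tuple field carrying the wrapper name. */
    static boolean isTupleWrapper(FieldSchema schema) {
        // Calling equals() on the constant, not on getName(), means a
        // null name yields false instead of a NullPointerException.
        return schema.getType() == TUPLE
                && PIG_TUPLE_WRAPPER.equals(schema.getName());
    }

    public static void main(String[] args) {
        // Unnamed tuple: the case assumed to trigger the original NPE.
        System.out.println(isTupleWrapper(new FieldSchema(TUPLE, null)));              // false
        // Tuple carrying the wrapper name.
        System.out.println(isTupleWrapper(new FieldSchema(TUPLE, PIG_TUPLE_WRAPPER))); // true
        // Tuple with some other name.
        System.out.println(isTupleWrapper(new FieldSchema(TUPLE, "messages")));        // false
        // Non-tuple field that happens to carry the wrapper name.
        System.out.println(isTupleWrapper(new FieldSchema(INTEGER, PIG_TUPLE_WRAPPER))); // false
    }
}
```

Putting the constant on the left of equals() collapses the nested if statements into one expression and behaves the same as the explicit null check in the posted fix. The unnamed-tuple case is the one a regression test attached to the Jira ticket would most need to cover.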