Yes, please post to the Pig JIRA, preferably with an example of how to
reproduce the error (better yet, a test that demonstrates the fix).

On Dec 6, 2011, at 11:25 PM, Russell Jurney wrote:

> I fixed the bug, in AvroStorageUtils.java:
>
>     /** Check whether the schema is just a wrapped tuple. */
>     public static boolean isTupleWrapper(ResourceFieldSchema pigSchema) {
>         return pigSchema.getType() == DataType.TUPLE
>             && pigSchema.getName() != null  // guard: anonymous fields have no name
>             && pigSchema.getName().equals(AvroStorageUtils.PIG_TUPLE_WRAPPER);
>     }
>
> The script now works. Will make a patch. Should I make a ticket?
>
> On Tue, Dec 6, 2011 at 5:36 PM, Dmitriy Ryaboy wrote:
>
>> If you send a pull to wilbur, I can merge it. But we are also still
>> supporting piggybank, as wilbur never really got off the ground...
>>
>> D
>>
>> On Tue, Dec 6, 2011 at 3:47 PM, Russell Jurney wrote:
>>
>>> I'm debugging the AvroStorage UDF in piggybank for this blog post:
>>> http://datasyndrome.com/post/13707537045/booting-the-analytics-application-events-ruby
>>>
>>> The script is:
>>>
>>>     messages = LOAD '/tmp/messages.avro' USING AvroStorage();
>>>     user_groups = GROUP messages BY user_id;
>>>     per_user = FOREACH user_groups {
>>>         sorted = ORDER messages BY message_id DESC;
>>>         GENERATE group AS user_id, sorted AS messages;
>>>     };
>>>     DESCRIBE per_user;
>>>     per_user: {user_id: int,messages: {(message_id: int,topic: chararray,user_id: int)}}
>>>     STORE per_user INTO '/tmp/per_user.avro' USING AvroStorage();
>>>
>>> The error is:
>>>
>>>     Pig Stack Trace
>>>     ---------------
>>>     ERROR 1002: Unable to store alias per_user
>>>
>>>     org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias per_user
>>>         at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1596)
>>>         at org.apache.pig.PigServer.registerQuery(PigServer.java:584)
>>>         at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:942)
>>>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
>>>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
>>>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
>>>         at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:67)
>>>         at org.apache.pig.Main.run(Main.java:487)
>>>         at org.apache.pig.Main.main(Main.java:108)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>>     Caused by: java.lang.NullPointerException
>>>         at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.isTupleWrapper(AvroStorageUtils.java:327)
>>>         at org.apache.pig.piggybank.storage.avro.PigSchema2Avro.convert(PigSchema2Avro.java:82)
>>>         at org.apache.pig.piggybank.storage.avro.PigSchema2Avro.convert(PigSchema2Avro.java:105)
>>>         at org.apache.pig.piggybank.storage.avro.PigSchema2Avro.convertRecord(PigSchema2Avro.java:151)
>>>         at org.apache.pig.piggybank.storage.avro.PigSchema2Avro.convert(PigSchema2Avro.java:62)
>>>         at org.apache.pig.piggybank.storage.avro.AvroStorage.checkSchema(AvroStorage.java:502)
>>>         at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:65)
>>>         at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:77)
>>>         at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
>>>         at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
>>>         at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
>>>         at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
>>>         at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
>>>         at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
>>>         at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
>>>         at org.apache.pig.newplan.logical.rules.InputOutputFileValidator.validate(InputOutputFileValidator.java:45)
>>>         at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:292)
>>>         at org.apache.pig.PigServer.compilePp(PigServer.java:1360)
>>>         at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1297)
>>>         at org.apache.pig.PigServer.execute(PigServer.java:1286)
>>>         at org.apache.pig.PigServer.access$400(PigServer.java:125)
>>>         at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1591)
>>>         ... 13 more
>>>
>>> I need to fix this. Which means I need to commit a patch to get it into
>>> the current piggybank? I've got some time... is it worthwhile to
>>> resurrect wilbur on github and move piggybank over?
>>>
>>> --
>>> Russell Jurney
>>> twitter.com/rjurney
>>> russell.jurney@gmail.com
>>> datasyndrome.com
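Since a test demonstrating the fix was requested above, here is a minimal, self-contained sketch of what it would exercise. The `FieldSchema` class below is a hypothetical stand-in for Pig's `ResourceFieldSchema` (so the sketch runs without Pig on the classpath), and `PIG_TUPLE_WRAPPER` is a placeholder value; the real constant lives in `AvroStorageUtils`. The point is the null guard: the pre-fix code dereferenced `getName()` without checking for null, which is what threw the `NullPointerException` for anonymous fields like the output of `GROUP`.

```java
// Hypothetical stand-in for Pig's ResourceFieldSchema, so this runs standalone.
class FieldSchema {
    private final byte type;
    private final String name;  // null for anonymous fields, e.g. the output of GROUP

    FieldSchema(byte type, String name) {
        this.type = type;
        this.name = name;
    }

    byte getType() { return type; }
    String getName() { return name; }
}

public class TupleWrapperCheck {
    // Stand-in values; the real ones are DataType.TUPLE and
    // AvroStorageUtils.PIG_TUPLE_WRAPPER in piggybank.
    static final byte TUPLE = 110;
    static final String PIG_TUPLE_WRAPPER = "TUPLE_WRAPPER";

    /** Fixed check: guard against a null field name before comparing. */
    static boolean isTupleWrapper(FieldSchema s) {
        return s.getType() == TUPLE
            && s.getName() != null
            && s.getName().equals(PIG_TUPLE_WRAPPER);
    }

    public static void main(String[] args) {
        // Pre-fix, the anonymous (null-named) tuple schema threw an NPE here.
        FieldSchema anonymous = new FieldSchema(TUPLE, null);
        FieldSchema wrapper   = new FieldSchema(TUPLE, PIG_TUPLE_WRAPPER);

        System.out.println(isTupleWrapper(anonymous));  // false, no NPE
        System.out.println(isTupleWrapper(wrapper));    // true
    }
}
```

A real piggybank test would build the same two cases with `ResourceFieldSchema` and `DataType.TUPLE` and assert the same outcomes.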