pig-dev mailing list archives

From Santhosh Srinivasan <...@yahoo-inc.com>
Subject RE: Does the name of the tuple that a bag has to have matter?
Date Mon, 21 Nov 2011 07:19:12 GMT
It's an implementation artifact of the old JavaCC parser used in releases up to and including 0.8.
The new parser, as Alan points out, should not require it.


-----Original Message-----
From: Alan Gates [mailto:gates@hortonworks.com] 
Sent: Friday, November 18, 2011 9:00 AM
To: dev@pig.apache.org
Subject: Re: Does the name of the tuple that a bag has to have matter?

The name doesn't matter.  We mostly left it there for backward compatibility, for both specifying
schemas and for UDFs.  I do think we should make sure we ignore it everywhere (including equality
for schemas).


On Nov 16, 2011, at 7:17 PM, Jonathan Coveney wrote:

> This is related to an issue I'll probably be emailing about once I 
> isolate it, but I was curious what the philosophy is around the name 
> of the tuple that is in a bag.
> example:
> Schema s1 =
> Utils.getSchemaFromString("b:bag{t:tuple(name:chararray,age:int)}");
> In pig8, you had the whole two level access nonsense, so let's ignore that.
> In pig9, the tuple name seemed to be preserved, and would print with 
> toString.
> In trunk, the schema object throws away that name, and it doesn't print.
> I'm curious if there is any reason to keep it around, esp. given you 
> can just do Schema.equals(s1,s2,false,true) for equality without field 
> names, not to mention the fact that the name never really is going to 
> matter since a bag only has one element and it is a tuple.
> Thanks!
> Jon
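
To make the equality point in this thread concrete, here is a minimal, self-contained sketch of a schema comparison that ignores only the name of the tuple inside a bag. It does not use Pig's classes: `Field`, `strictEquals`, and `equalsIgnoringBagTupleName` are hypothetical names invented for illustration, and the type names are plain strings. It only models the idea that "b:bag{t:tuple(...)}" and the same bag with an unnamed tuple could be treated as the same schema.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

public class BagSchemaSketch {
    /** Hypothetical stand-in for a schema field: alias, type name, nested fields. */
    public static final class Field {
        final String alias;      // may be null when the field is unnamed
        final String type;       // e.g. "bag", "tuple", "chararray", "int"
        final List<Field> inner; // empty for simple types

        public Field(String alias, String type, Field... inner) {
            this.alias = alias;
            this.type = type;
            this.inner = Arrays.asList(inner);
        }
    }

    /** Every alias must match, including the alias of the tuple inside a bag. */
    public static boolean strictEquals(Field a, Field b) {
        return compare(a, b, false, false);
    }

    /** Aliases must match, except for fields that sit directly inside a bag. */
    public static boolean equalsIgnoringBagTupleName(Field a, Field b) {
        return compare(a, b, false, true);
    }

    private static boolean compare(Field a, Field b, boolean skipAlias, boolean relaxBagTuple) {
        if (!a.type.equals(b.type)) return false;
        if (!skipAlias && !Objects.equals(a.alias, b.alias)) return false;
        if (a.inner.size() != b.inner.size()) return false;
        // Children of a bag (its single tuple) may have their alias skipped.
        boolean skipChildAlias = relaxBagTuple && a.type.equals("bag");
        for (int i = 0; i < a.inner.size(); i++) {
            if (!compare(a.inner.get(i), b.inner.get(i), skipChildAlias, relaxBagTuple)) {
                return false;
            }
        }
        return true;
    }

    /** Builds b:bag{<tupleAlias>:tuple(name:chararray, <secondAlias>:int)}. */
    public static Field bagOf(String tupleAlias, String secondAlias) {
        return new Field("b", "bag",
                new Field(tupleAlias, "tuple",
                        new Field("name", "chararray"),
                        new Field(secondAlias, "int")));
    }

    public static void main(String[] args) {
        Field named = bagOf("t", "age");     // tuple called "t"
        Field unnamed = bagOf(null, "age");  // tuple name dropped
        Field renamed = bagOf("t", "years"); // an inner field name differs

        System.out.println(strictEquals(named, unnamed));
        System.out.println(equalsIgnoringBagTupleName(named, unnamed));
        System.out.println(equalsIgnoringBagTupleName(named, renamed));
    }
}
```

Unlike a comparison that drops all field names, this sketch still distinguishes schemas whose inner field names differ; only the bag's tuple name is ignored.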
