hadoop-pig-dev mailing list archives

From Alan Gates <ga...@yahoo-inc.com>
Subject Re: yahoo-specific pig - improvements to syntax and .pigrc
Date Thu, 14 Feb 2008 18:50:17 GMT
Craig,

Here's my thinking on this, though I don't speak for all of the 
committers.  Pig should have 3 ways to pick up configuration:

1) from .pigrc, as it does now
2) when embedded in another java program, the caller should be able to 
set values in PigContext, as I referred to in my response to Benjamin's 
email.
3) From the pig script, we should be able to do something like:  set conf.x 
= y (I'm not necessarily suggesting syntax here).
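
To make the three sources concrete, here is a minimal sketch of how they could be layered. It is written in Python only to keep it short (Pig itself is Java), and it is an illustration rather than Pig's actual behaviour. The precedence order (script over embedding caller over .pigrc), the function names, and the keys `cluster` and `debug` are all assumptions; nothing above fixes them.

```python
def parse_properties(text):
    """Parse a small subset of .properties syntax: key=value lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without '='
            props[key.strip()] = value.strip()
    return props


def resolve_config(pigrc_text, embedded_overrides=None, script_sets=None):
    """Merge the three sources; later sources win (assumed precedence)."""
    conf = parse_properties(pigrc_text)   # 1) values from .pigrc
    conf.update(embedded_overrides or {})  # 2) values set by an embedding caller
    conf.update(script_sets or {})         # 3) values from 'set' in the script
    return conf


if __name__ == "__main__":
    pigrc = "# hypothetical .pigrc contents\ncluster = a\ndebug = off\n"
    print(resolve_config(pigrc, {"cluster": "b"}, {"debug": "on"}))
```

Whether a script-level set should really override what an embedding program asked for is an open design question; the sketch just picks one order so the merge is visible.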

With those in mind, to answer your specific questions:

Craig Macdonald wrote:
> Hi pig-devs,
>
> Just a quiet ping for comments from project leads on this. It seems I 
> have raised several issues recently that require configuration of pig. 
> Questions:
> * Are System properties the best place for these?
We seem to be putting them in PigContext.conf for now, which is probably 
a fine place for them.
> * Should .pigrc evolve into a place for Pig aliases and properties, 
> and even scripts? (similar to .bashrc etc)
Right now you can store pig properties here.  It's not clear it needs to 
grow beyond that.  What use case do you see for storing aliases or 
scripts here?
> * Should new commands be added: import, include, sharedFS etc?
I'm guessing this is the same thing as I'm saying in 3 above.  If not, 
please elaborate on what these new commands would do.
> * Please direct me as to how JIRAs should be created.
> I may be able to provide patches to some JIRAs I have created if we 
> have a policy for configuration-type stuff.
>
> C
>
> Craig Macdonald wrote:
>> Good morning Pig-devs,
>>
>> This email notes some of the yahoo specifics remaining in Pig that 
>> may need to be checked before a Pig release (see 1. below). I would 
>> hope that Pig syntax can be evolved to allow these to be removed (see 
>> (2) below), and instead placed in users' .pigrc.
>>
>> From PigContext, I note that the .pigrc is in fact a place for 
>> properties. An alternative would be for the Grunt set command to set 
>> System properties, and then make .pigrc into a pig script, allowing 
>> users to define aliases, register common jar files, import common 
>> namespaces, include other pig script files.
>>
>> Please direct how JIRAs should be created to track these issues - one 
>> issue for all, with subtasks; separate tasks for the three issues below?
>>
>> Details below.
>>
>> Craig
>>
>> 1. Yahoo specifics
>>
>> src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java
>>    doHod() contains yahoo specific stuff. I'm not sure Hod has 
>> stabilised sufficiently for this to be changed
>>    fixUpDomain() assumes unqualified hostnames are part of the 
>> .inktomisearch.com DNS domain
>>
>> src/org/apache/pig/impl/PigContext.java:125
>>        packageImportList.add("com.yahoo.pig.yst.sds.ULT.");
>>       Note - there is no Pig command to allow imports of package 
>> namespaces into the packageImportList ArrayList
>>
>> scripts/pig.pl
>>    kryptontite mentions, specifics: 69, 114
>>
>> 2. Extensions to Pig syntax
>> (a) "set" command sets all system properties
>> (b) "include" includes and parses another pig script
>> (c) "import" adds a package namespace to the search path
>>
>> 3. Change ~/.pigrc into a pig script that is parsed on 
>> startup of Grunt/PigServer?
>>
>
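
For reference on item 2(c) and the packageImportList note, the following is a rough sketch of what an import search path amounts to: a list of package prefixes tried in order against a short class name. It is in Python for brevity and does not reproduce PigContext's actual resolution code, which is not shown in this thread. The class `ImportPath`, its methods, and the package `com.example.udf` are hypothetical.

```python
class ImportPath:
    """Ordered list of package prefixes used to resolve short class names."""

    def __init__(self, prefixes=None):
        # The empty prefix lets fully qualified names resolve unchanged.
        self.prefixes = list(prefixes or [""])

    def add_import(self, package):
        """What a hypothetical 'import' command would do: append a prefix."""
        prefix = package if package.endswith(".") else package + "."
        if prefix not in self.prefixes:
            self.prefixes.append(prefix)

    def resolve(self, name, known_classes):
        """Return the first prefix + name found in known_classes, else None."""
        for prefix in self.prefixes:
            candidate = prefix + name
            if candidate in known_classes:
                return candidate
        return None


if __name__ == "__main__":
    known = {"com.example.udf.Tokenize"}  # stand-in for classes on the classpath
    path = ImportPath()
    path.add_import("com.example.udf")
    print(path.resolve("Tokenize", known))
```

With something along these lines, a site-specific prefix would be added from a user's own configuration or script instead of being hard-coded.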
