hadoop-pig-dev mailing list archives

From "Benjamin Reed (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-70) Improve PigContext code by using Factory Pattern
Date Mon, 28 Jan 2008 16:21:34 GMT

    [ https://issues.apache.org/jira/browse/PIG-70?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12563183#action_12563183 ]

Benjamin Reed commented on PIG-70:

Benjamin, you are correct that PigContext is kind of a hodgepodge of necessary information.
I think most of it will be going away:

HOD is changing (AGAIN!). I haven't seen the latest direction, but it would probably be good
to move the HOD code to the Hadoop-specific package.

Also, I think the jobconf will move to the PlatformConfiguration Map. Once that happens, PigContext
will look more like a singleton than a factory.

In a nutshell, I don't think we can finish this until PIG-66 and PIG-32 get in.

> Improve PigContext code by using Factory Pattern
> ------------------------------------------------
>                 Key: PIG-70
>                 URL: https://issues.apache.org/jira/browse/PIG-70
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>    Affects Versions: 0.1.0
>            Reporter: Benjamin Francisoud
>         Attachments: PIG-70-v01.patch
> Even if the PigContext code is still quite small at the moment, for an outsider (me)
it's already hard to understand :(
> If I understand correctly, the purpose of PigContext (from an object-oriented point of view)
is to hold various configuration objects like JobConf, JobClient, JobSubmissionProtocol...
> The initialization code mainly uses the ExecType parameter, but it can also involve quite complex
code like: doHod(), initProperties(), connect()...
> (btw, the connect() method is actually doing 2 things: initializing some variables and trying
to connect; its initialization code should be moved somewhere else)
> It is the perfect case to apply the [Factory Pattern|http://en.wikipedia.org/wiki/Factory_method_pattern];
see also [Replace Constructor with Factory Method|http://www.refactoring.com/catalog/replaceConstructorWithFactoryMethod.html]
for more details.
> My proposal is to create a new PigContextFactory class to hold the initialization code,
making PigContext and PigContextFactory two 200-line classes instead of one big 500-line
class.
> PigContext would keep some getters and setters, plus the methods related to instantiating and running "functions".
> The new API would be:
> h4. PigContextFactory.java
> {code:java}
> public class PigContextFactory {
>     public static PigContext getInstance(ExecType execType) {...}
> }
> {code}
> h4. PigContext.java
> {code:java}
> public class PigContext implements Serializable, FunctionInstantiator {
>     public String getJobName(){...}
>     public JobSubmissionProtocol getJobTracker() {...}
>     public JobConf getConf() {...}
>     public static Object instantiateFuncFromSpec(String funcSpec) throws IOException{...}
>     public Object instantiateFuncFromAlias(String alias) throws IOException {...}
>     public void registerFunction(String function, String functionSpec) {...}
> }
> {code}
> h4. Client code
> {code:java}
> PigContext context = PigContextFactory.getInstance(ExecType.MAPREDUCE);
> {code}
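
The split proposed in the issue could look roughly like the sketch below. It is not the attached patch: `SketchPigContext`, `SketchPigContextFactory`, the two-value `ExecType` stand-in, and the `exec.mode` / `cluster.connected` keys are all hypothetical names chosen for illustration, and a plain `Map` stands in for `JobConf`. The point is only that the ExecType-dependent setup sits in the factory, while the context holds configuration that is already initialized.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Pig's ExecType; the real enum may have other values.
enum ExecType { LOCAL, MAPREDUCE }

// Simplified context: it only holds configuration that is already initialized.
final class SketchPigContext {
    private final ExecType execType;
    private final Map<String, String> conf; // stand-in for JobConf

    SketchPigContext(ExecType execType, Map<String, String> conf) {
        this.execType = execType;
        this.conf = Collections.unmodifiableMap(new HashMap<>(conf));
    }

    ExecType getExecType() { return execType; }

    Map<String, String> getConf() { return conf; }
}

// All ExecType-dependent initialization lives here, not in the context.
final class SketchPigContextFactory {
    private SketchPigContextFactory() {}

    static SketchPigContext getInstance(ExecType execType) {
        Map<String, String> conf = new HashMap<>();
        switch (execType) {
            case LOCAL:
                conf.put("exec.mode", "local");
                break;
            case MAPREDUCE:
                conf.put("exec.mode", "mapreduce");
                // Placeholder for the setup that doHod()/initProperties() do today;
                // connecting would be a separate step, not part of initialization.
                conf.put("cluster.connected", "false");
                break;
            default:
                throw new IllegalArgumentException("Unsupported ExecType: " + execType);
        }
        return new SketchPigContext(execType, conf);
    }
}

class PigContextFactoryDemo {
    public static void main(String[] args) {
        SketchPigContext ctx = SketchPigContextFactory.getInstance(ExecType.MAPREDUCE);
        System.out.println(ctx.getExecType() + " " + ctx.getConf().get("exec.mode"));
    }
}
```

Keeping the context immutable after construction is a choice made for this sketch; the actual PigContext also carries function registration state, which is left out here.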

