hadoop-pig-dev mailing list archives

From "Gaurav Jain (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Date Wed, 10 Feb 2010 23:28:30 GMT

    [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832287#action_12832287 ]

Gaurav Jain commented on PIG-1140:
----------------------------------


A few suggestions on the implementation:


TableLoader: 
 -- In the initialize() method, we should use

   Configuration conf = new Configuration(false), which creates an empty object.

   Configuration conf = new Configuration() populates the object from the default
   resources (the *-default.xml files), which may contain conflicting properties.

    (good to have)
 
 -- In the seekNear() method, we might want to check whether tableRecordReader is null.
    (good to have)
 
 -- In createIndexReader(), since we set the projection, we should not pass a null projection,
    as in createTableRecordReader(job, null).
    It should be createTableRecordReader(job, TableInputFormat.getProjection(job)). (need to have)
 
 -- In setLocation() and getSchema(), if we are handling paths == null then we might want
to check paths.isEmpty() as well. (good to have) 
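
The two defensive checks suggested for TableLoader (null reader in seekNear(), null-or-empty paths) could look roughly like the sketch below. This is plain Java with stand-in types: LoaderGuards and SeekableReader are hypothetical names, and the use of List<String> for paths is an assumption, since the actual type in TableLoader is not shown here.

```java
import java.io.IOException;
import java.util.List;

// Illustrative stand-ins only; these are not the Zebra TableLoader types.
final class LoaderGuards {

    /** Hypothetical stand-in for the record reader that seekNear() delegates to. */
    interface SeekableReader {
        void seekTo(String key) throws IOException;
    }

    private LoaderGuards() {}

    /** True when there is nothing to load: paths is null or has no entries. */
    static boolean isNullOrEmpty(List<String> paths) {
        return paths == null || paths.isEmpty();
    }

    /** Fails with a clear message instead of a NullPointerException. */
    static void seekNear(SeekableReader reader, String key) throws IOException {
        if (reader == null) {
            throw new IOException("seekNear() called before the record reader was created");
        }
        reader.seekTo(key);
    }
}
```

Whether a null reader should raise an exception or be silently ignored is a design choice; the sketch raises IOException so that the caller sees the problem early.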
 
 
 
 
 TableStorer: 
 
 -- Instead of implementing new classes (TableOutputFormat and TableOutputCommitter), we should
use BasicTableOutputFormat and BasicTableOutputFormat.TableOutputCommitter from the zebra mapreduce
package. (must have)
 
    (There will be a separate jira/patch for this.)

 
 -- The code from storeSchema should go into TableOutputFormat.TableOutputCommitter.cleanupJob(). 
 
 -- Does Pig call OutputCommitter.abortJob() for failed jobs?
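
To make the storeSchema suggestion concrete, here is a minimal sketch in plain Java. JobHooks and SchemaWritingCommitter are hypothetical stand-ins, not Hadoop's OutputCommitter or the Zebra classes. The sketch only shows schema persistence living in the committer's cleanupJob() hook rather than in the storer, and it assumes that some framework invokes abortJob() on failure, which is exactly the open question above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the job-level hooks of an output committer.
interface JobHooks {
    void cleanupJob();
    void abortJob();
}

// Hypothetical committer that owns schema persistence.
final class SchemaWritingCommitter implements JobHooks {
    private final String schema;
    private final List<String> log = new ArrayList<>();

    SchemaWritingCommitter(String schema) {
        this.schema = schema;
    }

    @Override
    public void cleanupJob() {
        // What storeSchema() did in the storer would happen here instead.
        log.add("schema written: " + schema);
    }

    @Override
    public void abortJob() {
        // Failed job: the schema is never published.
        log.add("job aborted, schema not written");
    }

    /** Returns the recorded actions, in call order. */
    List<String> log() {
        return log;
    }
}
```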
 


> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>            Assignee: Xuefu Zhang
>             Fix For: 0.7.0
>
>         Attachments: zebra.0209
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to
> its 2.0 APIs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

