hadoop-pig-dev mailing list archives

From "Daniel Dai (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1548) Optimize scalar to consolidate the part file
Date Sun, 05 Sep 2010 05:29:34 GMT

    [ https://issues.apache.org/jira/browse/PIG-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906321#action_12906321 ]

Daniel Dai commented on PIG-1548:

The patch breaks TestFRJoin2.testConcatenateJobForScalar3. Commenting out TestFRJoin2.testConcatenateJobForScalar3.

> Optimize scalar to consolidate the part file
> --------------------------------------------
>                 Key: PIG-1548
>                 URL: https://issues.apache.org/jira/browse/PIG-1548
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Daniel Dai
>            Assignee: Richard Ding
>             Fix For: 0.8.0
>         Attachments: PIG-1548.patch, PIG-1548_1.patch
> The current scalar implementation writes a scalar file to DFS. When Pig needs the scalar,
> it opens the DFS file directly. Each scalar file contains more than one part file even though
> it holds only one record. This puts a heavy load on the namenode. We should consolidate the part
> files before opening them. An optional further step is to put the consolidated file into the
> distributed cache, which further reduces the load on the namenode.
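
The consolidation step described above can be illustrated with a minimal sketch. This is not the code from the attached patches: it uses the local filesystem as a stand-in for DFS, and the names `consolidate_part_files`, `demo`, `scalar_output`, and `scalar_merged` are hypothetical. It only shows the idea of collapsing several part files (most of them empty) into one file, so that a reader opens a single file instead of many.

```python
import shutil
import tempfile
from pathlib import Path


def consolidate_part_files(output_dir: Path, merged_path: Path) -> Path:
    """Concatenate every part-* file in output_dir into one file, in name order.

    Hypothetical helper: a local-filesystem stand-in for merging DFS part files.
    """
    part_files = sorted(p for p in output_dir.glob("part-*") if p.is_file())
    with merged_path.open("wb") as merged:
        for part in part_files:
            with part.open("rb") as src:
                shutil.copyfileobj(src, merged)
    return merged_path


def demo() -> str:
    """Build a scalar-like output (one record spread over three parts) and merge it."""
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "scalar_output"
        out.mkdir()
        # Only one part file actually holds the single scalar record.
        (out / "part-00000").write_text("")
        (out / "part-00001").write_text("42\n")
        (out / "part-00002").write_text("")
        merged = consolidate_part_files(out, Path(tmp) / "scalar_merged")
        return merged.read_text()
```

Calling `demo()` returns the single record from the merged file. On a real cluster the same idea would go through Hadoop's filesystem layer or a separate merge job rather than local file copies, and the merged file could then be shipped through the distributed cache as the description suggests.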

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
