hadoop-pig-dev mailing list archives

From "Xuefu Zhang (JIRA)" <j...@apache.org>
Subject [jira] Updated: (PIG-1421) [Zebra] Pig script with Zebra data storage brings down name node due to excessive name node call.
Date Mon, 17 May 2010 19:06:43 GMT

     [ https://issues.apache.org/jira/browse/PIG-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated PIG-1421:
-----------------------------

    Attachment: PIG-1421.patch

Fix includes:

1. Make setLocation() lightweight and ensure that it makes no name node access. Note that setLocation()
is a new API on LoadFunc introduced in 0.7. UDFContext is used in some cases.
2. Remove the code that sets properties (INPUT_FE and INPUT_DELETED_CGS) in TableInputFormat
because it is ineffective.
3. Move the logic in #2 to TableInputFormat.setInputPaths() and make sure that it is done only
once (because setInputPaths() is called multiple times in the Pig code path).
4. Remove unnecessary list status calls in the Zebra IO layer.
5. Remove the code that makes name node calls for sorted tables in the Pig code path.
6. Make sure that the clob check is done only on the front end.
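
A minimal sketch of the idea behind items 1 and 3 above, assuming a plain java.util.Properties
object as a stand-in for the job configuration. The class, the property keys, and the
expensiveLookup() helper are hypothetical and are not taken from the patch or from the Pig,
Hadoop, or Zebra APIs. It only illustrates keeping setLocation() free of name node access and
guarding the expensive setup so that repeated setInputPaths() calls do the work once.

```java
import java.util.Properties;

/** Hypothetical illustration only; not the Zebra TableInputFormat or Pig LoadFunc API. */
public class OnceOnlySetup {
    // Hypothetical keys; the real patch may use different property names.
    static final String LOCATION_KEY = "example.input.location";
    static final String RESOLVED_KEY = "example.input.resolved";
    static final String DONE_KEY = "example.input.setup.done";

    // Counts simulated name node calls so the effect of the guard is visible.
    static int nameNodeCalls = 0;

    /** Simulates an expensive name node operation such as a list status call. */
    static String expensiveLookup(String path) {
        nameNodeCalls++;
        return "resolved:" + path;
    }

    /** Lightweight: records the location only and performs no file system access. */
    static void setLocation(Properties conf, String location) {
        conf.setProperty(LOCATION_KEY, location);
    }

    /** May be called many times; the expensive work runs only on the first call. */
    static void setInputPaths(Properties conf) {
        if (conf.getProperty(DONE_KEY) != null) {
            return; // setup already done for this configuration
        }
        String location = conf.getProperty(LOCATION_KEY);
        conf.setProperty(RESOLVED_KEY, expensiveLookup(location));
        conf.setProperty(DONE_KEY, "true");
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        setLocation(conf, "/data/table");
        setLocation(conf, "/data/table"); // repeated calls stay cheap
        setInputPaths(conf);
        setInputPaths(conf);
        setInputPaths(conf);
        System.out.println("simulated name node calls: " + nameNodeCalls);
    }
}
```

Storing the marker in the configuration itself, rather than in a static flag, is one possible
design choice: the guard then travels with the configuration object instead of with the JVM.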

> [Zebra] Pig script with Zebra data storage brings down name node due to excessive name node call.
> -------------------------------------------------------------------------------------------------
>
>                 Key: PIG-1421
>                 URL: https://issues.apache.org/jira/browse/PIG-1421
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>             Fix For: 0.7.0
>
>         Attachments: PIG-1421.patch
>
>
> Because Pig calls setLocation() on the LoadFunc API on both the front end and the back end, and Zebra
> makes name node accesses in its implementation, the name node becomes unresponsive because of the
> number of name node calls.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

