hadoop-pig-dev mailing list archives

From "Bill Graham (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1174) Creation of output path should be done by storage function
Date Thu, 28 Jan 2010 20:21:35 GMT

    [ https://issues.apache.org/jira/browse/PIG-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12806071#action_12806071 ]

Bill Graham commented on PIG-1174:

Hi Olga,

You marked this bug as fixed, but from your comments it seems that it's
actually a duplicate or child task of some other JIRA having to do with LSR.
If that's the case, can you please mark it as such so we can have a JIRA to
track the progress of this work?


> Creation of output path should be done by storage function
> ----------------------------------------------------------
>                 Key: PIG-1174
>                 URL: https://issues.apache.org/jira/browse/PIG-1174
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Bill Graham
>             Fix For: 0.7.0
> When executing a STORE command, Pig creates the output location before the storage function
> gets called. This causes problems with storage functions that have logic to determine the
> output location. See this thread:
> http://www.mail-archive.com/pig-user%40hadoop.apache.org/msg01538.html
> For example, when making a request like this:
> STORE A INTO '/my/home/output' USING MultiStorage('/my/home/output','0', 'none', '\t');
> Pig creates a file '/my/home/output' and then an exception is thrown when MultiStorage
> tries to make a directory under '/my/home/output'. The workaround is to instead specify a
> dummy location as the first path like so:
> STORE A INTO '/my/home/output/temp' USING MultiStorage('/my/home/output','0', 'none',
> Two changes should be made:
> 1. The path specified in the INTO clause should be available to the storage function
> so it doesn't need to be duplicated.
> 2. The creation of the output paths should be delegated to the storage function.
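
For anyone trying to picture the failure described in the issue, here is a minimal
sketch that mimics it on a local filesystem. It is an analogy only, not Pig,
MultiStorage, or HDFS code: the function name storage_can_mkdir and the
subdirectory name "key=0" are made up for illustration, and the assumption is
that the pre-created INTO location behaves like a plain file.

```python
import os
import tempfile


def storage_can_mkdir(root, *into_parts):
    """Return True if a directory can be created under <root>/output after
    the INTO location has been pre-created as a plain file.

    Hypothetical helper: it stands in for "the framework creates the INTO
    path, then the storage function creates its own directories".
    """
    into = os.path.join(root, *into_parts)
    # Step 1: the framework pre-creates the INTO location as a file.
    os.makedirs(os.path.dirname(into), exist_ok=True)
    with open(into, "w"):
        pass
    # Step 2: the storage function tries to create a directory under
    # <root>/output, the path it was given as its own argument.
    try:
        os.makedirs(os.path.join(root, "output", "key=0"), exist_ok=True)
        return True
    except OSError:
        # <root>/output is a file, so nothing can be created beneath it.
        return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # INTO '/my/home/output': the output path itself becomes a file.
        print(storage_can_mkdir(d, "output"))
    with tempfile.TemporaryDirectory() as d:
        # INTO '/my/home/output/temp': the dummy-location workaround.
        print(storage_can_mkdir(d, "output", "temp"))
```

The first call models the failing STORE and the second models the dummy 'temp'
workaround. Delegating path creation to the storage function, as proposed in
item 2, would make step 1 unnecessary.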

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
