hadoop-pig-dev mailing list archives

From "Andrew Hitchcock (JIRA)" <j...@apache.org>
Subject [jira] Updated: (PIG-1564) add support for multiple filesystems
Date Wed, 25 Aug 2010 01:52:23 GMT

     [ https://issues.apache.org/jira/browse/PIG-1564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Hitchcock updated PIG-1564:
----------------------------------

    Attachment: PIG-1564-1.patch

At the moment you cannot, say, read from S3N and write to HDFS in the same job (or even read
from one S3N bucket and write to another).
 
The essence of this patch is a change to the way HDataStorage works. Previously it mapped
to a single Hadoop FileSystem object, which limited each job to one FileSystem. It is now
a wrapper around all Hadoop FileSystems and returns the correct one based on the prefix
of the requested path.
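
As a rough illustration of the prefix-based lookup described above (not the code in the attached patch), here is a minimal, self-contained Java sketch. The class FileSystemRouter, its methods, and the "opener" function are hypothetical names invented for this example; the opener stands in for whatever actually opens a file system. Hadoop's own FileSystem.get(URI, Configuration) does a comparable scheme-based lookup, but the sketch deliberately avoids any Hadoop dependency.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical sketch: hands out one handle per file system, chosen by the
 * "scheme://authority" prefix of the requested path. H is a placeholder for
 * a real file system object.
 */
class FileSystemRouter<H> {
    private final String defaultPrefix;        // used for paths with no scheme, e.g. "/user/alice"
    private final Function<String, H> opener;  // assumption: opens a file system for a given prefix
    private final Map<String, H> cache = new HashMap<>();

    FileSystemRouter(String defaultPrefix, Function<String, H> opener) {
        this.defaultPrefix = defaultPrefix;
        this.opener = opener;
    }

    /** Returns "scheme://authority" for qualified paths, or the default prefix otherwise. */
    String prefixOf(String path) {
        URI uri = URI.create(path);
        if (uri.getScheme() == null) {
            return defaultPrefix;
        }
        String authority = uri.getAuthority();
        return uri.getScheme() + "://" + (authority == null ? "" : authority);
    }

    /** Returns the cached handle for the path's file system, opening it on first use. */
    H handleFor(String path) {
        return cache.computeIfAbsent(prefixOf(path), opener);
    }
}
```

With a router like this, a path under s3n://bucket-a and a path under hdfs://namenode:8020 resolve to different handles within the same job, while unqualified paths fall back to the default file system.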
 
Another small change: previously Pig assumed the default home directory was '/user/<username>'
on the default file system. This directory does not always exist, so I made this
configurable with a new property, "pig.initial.fs.name".

> add support for multiple filesystems
> ------------------------------------
>
>                 Key: PIG-1564
>                 URL: https://issues.apache.org/jira/browse/PIG-1564
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Andrew Hitchcock
>         Attachments: PIG-1564-1.patch
>
>
> Currently you can't run Pig scripts that read data from one file system and write it
> to another. Also, Grunt doesn't support CDing from one directory to another on different file
> systems.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

