hadoop-hive-dev mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: [jira] Updated: (HIVE-1272) Add SymlinkTextInputFormat to Hive
Date Wed, 31 Mar 2010 13:46:36 GMT
In getTargetPathsFromSymlinksDirs(), should we check whether each target
path exists?
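
For illustration, such a check might look like the sketch below. It uses java.nio.file as a stand-in for the Hadoop FileSystem API, and the class and method names (TargetPathCheck, findMissingTargets) are hypothetical; they are not taken from the patch.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HIVE-1272.1.patch.
public class TargetPathCheck {

    /** Returns the target paths that do not exist, in input order. */
    public static List<Path> findMissingTargets(List<Path> targets) {
        List<Path> missing = new ArrayList<>();
        for (Path target : targets) {
            // Assumption: a local-filesystem existence test stands in for
            // the equivalent check against the job's file system.
            if (!Files.exists(target)) {
                missing.add(target);
            }
        }
        return missing;
    }
}
```

A caller could then fail fast with a message naming the missing targets, or skip them, depending on which behavior is preferred.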

On Tue, Mar 30, 2010 at 6:50 PM, Guanghao Shen (JIRA) <jira@apache.org> wrote:

>
>     [
> https://issues.apache.org/jira/browse/HIVE-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]
>
> Guanghao Shen updated HIVE-1272:
> --------------------------------
>
>    Attachment: HIVE-1272.1.patch
>
> This input format parses the symlink files and generates input splits from
> the parsed target paths.
>
> A symlink file is a text file that contains a list of target file or
> directory names. The provided map input path will contain the symlink files.
>
> > Add SymlinkTextInputFormat to Hive
> > ----------------------------------
> >
> >                 Key: HIVE-1272
> >                 URL: https://issues.apache.org/jira/browse/HIVE-1272
> >             Project: Hadoop Hive
> >          Issue Type: New Feature
> >    Affects Versions: 0.5.0
> >            Reporter: Zheng Shao
> >            Assignee: Guanghao Shen
> >         Attachments: HIVE-1272.1.patch
> >
> >
> > We'd like to add a symlink text input format so that we can specify the
> > list of files for a table/partition based on the content of a text file.
> > For example, the location of the table is "/user/hive/mytable".
> > There is a file called "/user/hive/mytable/myfile.txt".
> > Inside the file, there are 2 lines, "/user/myname/textfile1.txt" and
> > "/user/myname/textfile2.txt".
> > We can do:
> > {code}
> > CREATE TABLE mytable (...) STORED AS INPUTFORMAT
> > 'org.apache.hadoop.hive.io.SymlinkTextInputFormat' LOCATION
> > '/user/hive/mytable';
> > SELECT * FROM mytable;
> > {code}
> > which will return the content of the 2 files:
> > "/user/myname/textfile1.txt" and "/user/myname/textfile2.txt".
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>
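
The symlink-file format described in the quoted update (a text file listing one target path per line, stored under the table location) can be sketched roughly as follows. This is only an illustration under stated assumptions: it reads from the local file system with java.nio.file rather than through Hadoop, it skips blank lines (the description does not say how they are handled), and the names SymlinkFileReader and readTargetPaths are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the code in HIVE-1272.1.patch.
public class SymlinkFileReader {

    /** Collects the target paths listed in every regular file under symlinkDir. */
    public static List<Path> readTargetPaths(Path symlinkDir) throws IOException {
        List<Path> targets = new ArrayList<>();
        // Directory iteration order is unspecified, so targets from different
        // symlink files may appear in any relative order.
        try (DirectoryStream<Path> files = Files.newDirectoryStream(symlinkDir)) {
            for (Path symlinkFile : files) {
                if (!Files.isRegularFile(symlinkFile)) {
                    continue; // ignore subdirectories
                }
                for (String line : Files.readAllLines(symlinkFile, StandardCharsets.UTF_8)) {
                    String trimmed = line.trim();
                    // Assumption: blank lines carry no target and are skipped.
                    if (!trimmed.isEmpty()) {
                        targets.add(Paths.get(trimmed));
                    }
                }
            }
        }
        return targets;
    }
}
```

With the example from the issue, a directory holding "myfile.txt" with the two lines shown would yield the two target paths, which the input format would then turn into splits.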
