hive-dev mailing list archives

From "Guanghao Shen (JIRA)" <>
Subject [jira] Commented: (HIVE-1272) Add SymlinkTextInputFormat to Hive
Date Wed, 31 Mar 2010 22:56:27 GMT


Guanghao Shen commented on HIVE-1272:

@Ted: In the current implementation, duplicated filenames will lead to duplicated input data.
I thought that if duplicated input data is not what is desired, the filename should not appear
in the symlink file more than once in the first place. No?
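
To make the duplicate behavior concrete, here is a minimal Python sketch. It is not the patch's Java code. The helper names (`resolve_targets`, `dedupe_targets`, `read_rows`) are hypothetical, and an in-memory dict stands in for the filesystem. It assumes each non-empty line of the symlink file names one target file.

```python
def resolve_targets(symlink_lines):
    """Return target paths in the order listed, keeping any duplicates."""
    return [line.strip() for line in symlink_lines if line.strip()]


def dedupe_targets(targets):
    """Drop repeated paths while preserving first-seen order."""
    return list(dict.fromkeys(targets))


def read_rows(targets, files):
    """Concatenate the rows of every target; `files` maps path -> list of rows."""
    rows = []
    for path in targets:
        rows.extend(files[path])
    return rows


# Hypothetical data: one target file listed twice in the symlink file.
files = {"/user/myname/textfile1.txt": ["a", "b"]}
symlink = ["/user/myname/textfile1.txt\n", "/user/myname/textfile1.txt\n"]

targets = resolve_targets(symlink)
duplicated = read_rows(targets, files)               # file is read twice
unique = read_rows(dedupe_targets(targets), files)   # file is read once
```

If de-duplication were wanted, it could be done when the symlink file is written or when it is resolved. The sketch only shows the second option and makes no claim about which the patch should choose.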

> Add SymlinkTextInputFormat to Hive
> ----------------------------------
>                 Key: HIVE-1272
>                 URL:
>             Project: Hadoop Hive
>          Issue Type: New Feature
>    Affects Versions: 0.5.0
>            Reporter: Zheng Shao
>            Assignee: Guanghao Shen
>         Attachments: HIVE-1272.1.patch
> We'd like to add a symlink text input format so that we can specify the list of files
> for a table/partition based on the content of a text file.
> For example, the location of the table is "/user/hive/mytable".
> There is a file called "/user/hive/mytable/myfile.txt".
> Inside the file, there are 2 lines, "/user/myname/textfile1.txt" and "/user/myname/textfile2.txt"
> We can do:
> {code}
> LOCATION '/user/hive/mytable';
> SELECT * FROM mytable;
> {code}
> which will return the content of the 2 files: "/user/myname/textfile1.txt" and "/user/myname/textfile2.txt"
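
The lookup described in the issue can be sketched in Python as follows. This is only an illustration of the idea under stated assumptions, not the input format's actual implementation. The function `read_symlink_table` is hypothetical, temporary local directories stand in for the HDFS paths in the example, and every regular file in the table directory is assumed to be a symlink file with one target path per line.

```python
import tempfile
from pathlib import Path


def read_symlink_table(table_dir):
    """Return the lines of every file listed in the symlink files under table_dir."""
    rows = []
    for symlink_file in sorted(Path(table_dir).iterdir()):
        if not symlink_file.is_file():
            continue
        for line in symlink_file.read_text().splitlines():
            target = line.strip()
            if target:  # skip blank lines in the symlink file
                rows.extend(Path(target).read_text().splitlines())
    return rows


with tempfile.TemporaryDirectory() as root:
    # Stand-ins for /user/myname and /user/hive/mytable.
    data_dir = Path(root, "myname")
    table_dir = Path(root, "mytable")
    data_dir.mkdir()
    table_dir.mkdir()

    f1 = data_dir / "textfile1.txt"
    f2 = data_dir / "textfile2.txt"
    f1.write_text("row1\nrow2\n")
    f2.write_text("row3\n")

    # The symlink file holds the paths of the data files, not the data itself.
    (table_dir / "myfile.txt").write_text(f"{f1}\n{f2}\n")

    result = read_symlink_table(table_dir)
```

Here `result` holds the rows of both target files in the order they are listed, which mirrors what the `SELECT` in the example is expected to return.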

