hadoop-hive-dev mailing list archives

From hive user <hiveu...@gmail.com>
Subject Re: [jira] Commented: (HIVE-951) Selectively include EXTERNAL TABLE source files via REGEX
Date Mon, 30 Nov 2009 04:50:45 GMT
Avram,
Instead of asking, why don't you develop it yourself? I believe
everyone on this list has a specific need.

Users should develop as well. This is a DEV list, not a suggestion
list.

Thx.

On Nov 25, 2009, at 13:42, "Avram Aelony (JIRA)" <jira@apache.org> wrote:

>
>    [ https://issues.apache.org/jira/browse/HIVE-951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782634#action_12782634 ]
>
> Avram Aelony commented on HIVE-951:
> -----------------------------------
>
> Interesting. As a user, I suppose I have no preference other than
> what can be available sooner to fill the current need.
>
>> Selectively include EXTERNAL TABLE source files via REGEX
>> ---------------------------------------------------------
>>
>>                Key: HIVE-951
>>                URL: https://issues.apache.org/jira/browse/HIVE-951
>>            Project: Hadoop Hive
>>         Issue Type: Improvement
>>         Components: Query Processor
>>           Reporter: Carl Steinbach
>>
>> CREATE EXTERNAL TABLE should allow users to cherry-pick files via
>> regular expression.
>>
>> CREATE EXTERNAL TABLE was designed to allow users to access data that
>> exists outside of Hive, and currently makes the assumption that all of
>> the files located under the supplied path should be included in the
>> new table. Users frequently encounter directories containing multiple
>> datasets, or directories that contain data in heterogeneous schemas,
>> and it's often impractical or impossible to adjust the layout of the
>> directory to meet the requirements of CREATE EXTERNAL TABLE. A good
>> example of this problem is creating an external table based on the
>> contents of an S3 bucket.
>>
>> One way to solve this problem is to extend the syntax of CREATE
>> EXTERNAL TABLE as follows:
>> CREATE EXTERNAL TABLE
>> ...
>> LOCATION path [file_regex]
>> ...
>> For example:
>> {code:sql}
>> CREATE EXTERNAL TABLE mytable1 ( a string, b string, c string )
>> STORED AS TEXTFILE
>> LOCATION 's3://my.bucket/' 'folder/2009.*\.bz2$';
>> {code}
>> Creates mytable1, which includes all files in s3://my.bucket/ whose
>> path matches the regex 'folder/2009.*\.bz2$'.
>> {code:sql}
>> CREATE EXTERNAL TABLE mytable2 ( d string, e int, f int, g int )
>> STORED AS TEXTFILE
>> LOCATION 'hdfs://data/' 'xyz.*2009.{4}\.bz2$';
>> {code}
>> Creates mytable2, which includes all files located under hdfs://data/
>> whose names match the regex 'xyz.*2009.{4}\.bz2$' (the equivalent of
>> the glob 'xyz*2009????.bz2').
>
> -- 
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
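
For anyone who wants to experiment with the idea, the file selection described in the quoted issue can be sketched in a few lines of Python. This is only an illustration, not Hive code: the select_files helper and the sample listing are hypothetical, and it assumes the regex is searched against each path relative to the LOCATION, which the issue does not spell out.

```python
import re


def select_files(relative_paths, file_regex):
    """Keep only the paths in which the regex finds a match.

    Assumption: "search" semantics (a match anywhere in the path),
    so the pattern's own anchors such as '$' decide where it must end.
    """
    pattern = re.compile(file_regex)
    return [path for path in relative_paths if pattern.search(path)]


# Hypothetical listing of object keys under s3://my.bucket/
listing = [
    "folder/2009-01-01.bz2",
    "folder/2009-01-02.bz2",
    "folder/2008-12-31.bz2",
    "folder/2009-01-01.txt",
    "other/2009-01-01.bz2",
]

# The file_regex from the mytable1 example in the issue
selected = select_files(listing, r"folder/2009.*\.bz2$")
print(selected)
# ['folder/2009-01-01.bz2', 'folder/2009-01-02.bz2']
```

With the sample listing, only the two .bz2 files under folder/ whose names start with 2009 are kept; everything else in the directory is ignored.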
