hive-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: Feature request: WHERE filename='x' or filename='y'
Date Thu, 17 Sep 2009 14:41:12 GMT
On Wed, Sep 16, 2009 at 10:42 PM, 김영우 <warwithin@gmail.com> wrote:
> Hi Edward,
>
> It would be nice and very useful. Sometimes I want to select my own
> 'partition' or 'datafile' explicitly, something like below:
>
> SELECT *  FROM weblogs PARTITION ('2009-09-17', '2009-09-18') WHERE
> col1='..' and col2= ...
>
> Or users can select data files from directory:
>
> SELECT *  FROM weblogs DATAFILE ('log1.txt', 'log2.txt') WHERE col1='..' and
> col2= ...
>
> Anyway, your idea is very cool!
>
> Youngwoo
>
> 2009/9/17 Edward Capriolo <edlinuxguru@gmail.com>
>>
>> I am dumping files into a Hive partition at five-minute intervals. I am
>> using LOAD DATA into a partition.
>>
>> weblogs
>> web1.00
>> web1.05
>> web1.10
>> ...
>> web2.00
>> web2.05
>> web2.10
>> ....
>>
>> Things that would be useful..
>>
>> Select files from the folder with a regex or exact name
>>
>> select * FROM logs where FILENAME LIKE(WEB1*)
>>
>> select * FROM LOGS WHERE FILENAME=web2.00
>>
>> Also it would be nice to be able to select offsets in a file, this
>> would make sense with appends
>>
>> select * from logs WHERE FILENAME=web2.00 FROMOFFSET=454644 [TOOFFSET=]
>>
>> Do these make sense to anyone?
>>
>> Edward
>
>
I added your comments to
https://issues.apache.org/jira/browse/HIVE-837
Depending on how you are set up, you can already do this with a WHERE clause.

SELECT * FROM weblogs PARTITION ('2009-09-17'

For example, I partition by date and by hour:
partitioned by (log_date_part string, log_hour_part string)

select * from table where log_date_part like ('2009%')

or

select * from table where log_date_part = '2009-05-05' OR
log_date_part = '2009-05-06'

So you should be able to do that already.
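
Here is a rough end-to-end sketch of that setup. Only the two partition columns come from my own tables. The data columns (host, url), the input path, and the partition values are made up for illustration, so adjust them to your own layout.

```sql
-- Hypothetical table: host and url are placeholder data columns.
-- The partition columns are not stored in the data files; each
-- (date, hour) pair maps to its own directory under the table.
CREATE TABLE weblogs (
  host STRING,
  url  STRING
)
PARTITIONED BY (log_date_part STRING, log_hour_part STRING);

-- Load one five-minute dump into a specific partition.
-- '/tmp/web1.00' is an assumed path, not a real location.
LOAD DATA INPATH '/tmp/web1.00'
INTO TABLE weblogs
PARTITION (log_date_part = '2009-09-17', log_hour_part = '00');

-- Restrict the query to two date partitions with an ordinary WHERE clause.
SELECT *
FROM weblogs
WHERE log_date_part = '2009-09-17'
   OR log_date_part = '2009-09-18';
```

Because the filter is on partition columns, it narrows the query to whole partitions. It does not pick individual files inside a partition, which is what HIVE-837 asks for.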
