hadoop-hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "He Yongqiang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-837) virtual column support (filename) in hive
Date Wed, 14 Oct 2009 05:04:31 GMT

    [ https://issues.apache.org/jira/browse/HIVE-837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765394#action_12765394
] 

He Yongqiang commented on HIVE-837:
-----------------------------------

Support filename as a virtual column is more easier for file pruning, and a udf called filename()
will allow capturing file name in query.
i think both are very useful and we should support both.

> virtual column support (filename) in hive
> -----------------------------------------
>
>                 Key: HIVE-837
>                 URL: https://issues.apache.org/jira/browse/HIVE-837
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Namit Jain
>
> Copying from some mails:
> I am dumping files into a hive partion on five minute intervals. I am using LOAD DATA
into a partition.
> weblogs
> web1.00
> web1.05
> web1.10
> ...
> web2.00
> web2.05
> web1.10
> ....
> Things that would be useful..
> Select files from the folder with a regex or exact name
> select * FROM logs where FILENAME LIKE(WEB1*)
> select * FROM LOGS WHERE FILENAME=web2.00
> Also it would be nice to be able to select offsets in a file, this would make sense with
appends
> select * from logs WHERE FILENAME=web2.00 FROMOFFSET=454644 [TOOFFSET=]
> select  
> substr(filename, 4, 7) as  class_A, 
> substr(filename,  8, 10) as class_B
> count( x ) as cnt
> from FOO
> group by
> substr(filename, 4, 7), 
> substr(filename,  8, 10) ;
> Hive should support virtual columns

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message