hadoop-hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Carl Steinbach (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-837) virtual column support (filename) in hive
Date Mon, 14 Dec 2009 18:53:19 GMT

    [ https://issues.apache.org/jira/browse/HIVE-837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790287#action_12790287
] 

Carl Steinbach commented on HIVE-837:
-------------------------------------

I agree. I think most people will want to use this in conjunction with external tables, and
in that
case they are really interested in the actual filename as opposed to the directory name.

> virtual column support (filename) in hive
> -----------------------------------------
>
>                 Key: HIVE-837
>                 URL: https://issues.apache.org/jira/browse/HIVE-837
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Namit Jain
>
> Copying from some mails:
> I am dumping files into a hive partion on five minute intervals. I am using LOAD DATA
into a partition.
> weblogs
> web1.00
> web1.05
> web1.10
> ...
> web2.00
> web2.05
> web1.10
> ....
> Things that would be useful..
> Select files from the folder with a regex or exact name
> select * FROM logs where FILENAME LIKE(WEB1*)
> select * FROM LOGS WHERE FILENAME=web2.00
> Also it would be nice to be able to select offsets in a file, this would make sense with
appends
> select * from logs WHERE FILENAME=web2.00 FROMOFFSET=454644 [TOOFFSET=]
> select  
> substr(filename, 4, 7) as  class_A, 
> substr(filename,  8, 10) as class_B
> count( x ) as cnt
> from FOO
> group by
> substr(filename, 4, 7), 
> substr(filename,  8, 10) ;
> Hive should support virtual columns

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message