hive-user mailing list archives

From Bejoy Ks
Subject Database/Schema , INTERVAL and SQL IN usages in Hive
Date Mon, 21 Feb 2011 14:53:58 GMT
Hi Experts,
     I'm using Hive for a few projects and have found it a great tool in Hadoop
for processing structured data end to end. Unfortunately, I'm facing a few
challenges, as follows.

Availability of database/schemas in Hive
I have multiple projects running in Hive, each with a fairly large number of
tables. With this many tables all together, things are looking a bit messy. Is
there any option for creating a database/schema in Hive, so that I can maintain
the tables in different databases/schemas corresponding to each project?
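
A possible layout, sketched under the assumption that the installed Hive release supports the CREATE DATABASE and USE statements (older releases may not). The database and table names below are hypothetical, chosen only for illustration:

```sql
-- Assumption: this Hive release supports CREATE DATABASE / USE.
-- project_a and events are hypothetical names used for illustration.
CREATE DATABASE IF NOT EXISTS project_a;

-- Switch to the project's database; tables created afterwards live there.
USE project_a;

CREATE TABLE events (seq_id BIGINT, phase STRING);

-- Lists only the tables in the current database.
SHOW TABLES;
```

With one database per project, each project's tables are listed and managed separately.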

Using INTERVAL
    I need to replicate a job running in a Teradata EDW in Hive, and I'm facing
a challenge here: I'm not able to identify a Hive equivalent of INTERVAL in
Teradata. Here is the snippet where I'm facing the issue:
 *** where r1.seq_id = r4.seq_id and r4.mc_datetime >= (r1.rc_datetime + INTERVAL '05' HOUR)
In this query, how do I replicate the last part in Hive, i.e.
(r1.rc_datetime + INTERVAL '05' HOUR), which adds 5 hours to the obtained
timestamp?
*The where condition is part of a very large query involving multiple tables.
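
One way the condition might be written, as a sketch only. It assumes rc_datetime and mc_datetime are strings in 'yyyy-MM-dd HH:mm:ss' form, so that Hive's unix_timestamp() can convert them to seconds since the epoch, and it expresses the 5 hours as 5 * 3600 seconds. The table names t1 and t4 are hypothetical stand-ins for the real tables:

```sql
-- Assumption: both datetime columns are 'yyyy-MM-dd HH:mm:ss' strings.
-- t1 and t4 are hypothetical table names; r1 and r4 are the aliases from the query.
SELECT r1.seq_id
FROM t1 r1
JOIN t4 r4 ON (r1.seq_id = r4.seq_id)
-- Compare in epoch seconds: 5 hours = 5 * 3600 seconds.
WHERE unix_timestamp(r4.mc_datetime) >= unix_timestamp(r1.rc_datetime) + 5 * 3600;
```

If the shifted value itself is needed as a string, from_unixtime(unix_timestamp(r1.rc_datetime) + 5 * 3600) converts it back.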

Using IN 
    How do we replicate the SQL IN operator in Hive, i.e.
 *** where R1.seq_id = r4.seq_id and r1.PROCCESS_PHASE IN ('Production', 'Stage', 'QA', 'Development')
The last part of the query is where I'm facing the challenge:
r1.PROCCESS_PHASE IN ('Production', 'Stage', 'QA', 'Development')
*The where condition is part of a very large query involving multiple tables.
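
If IN turns out not to be available in the Hive release in use, the same predicate can be spelled as a chain of equality tests joined with OR, which is standard SQL. A sketch, with t1 and t4 as hypothetical table names:

```sql
-- t1 and t4 are hypothetical table names; r1 and r4 are the aliases from the query.
SELECT r1.seq_id
FROM t1 r1
JOIN t4 r4 ON (r1.seq_id = r4.seq_id)
-- Equivalent to: r1.PROCCESS_PHASE IN ('Production', 'Stage', 'QA', 'Development')
WHERE (r1.PROCCESS_PHASE = 'Production'
    OR r1.PROCCESS_PHASE = 'Stage'
    OR r1.PROCCESS_PHASE = 'QA'
    OR r1.PROCCESS_PHASE = 'Development');
```

The parentheses around the OR chain matter once further AND conditions are added to the WHERE clause.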

Please advise.

Bejoy KS
