hive-user mailing list archives

From Bejoy Ks <bejoy...@yahoo.com>
Subject Re: Database/Schema , INTERVAL and SQL IN usages in Hive
Date Wed, 23 Feb 2011 11:39:59 GMT
Ajo,
    If the comparison set has a good number of elements, then going for a 
table would be beneficial. But with only a few elements, say 5, wouldn't 
multiple '=' comparisons be better?

Regards 
Bejoy KS




________________________________
From: Ajo Fod <ajo.fod@gmail.com>
To: user@hive.apache.org
Sent: Mon, February 21, 2011 10:04:41 PM
Subject: Re: Database/Schema , INTERVAL and SQL IN usages in Hive

On using SQL IN ... what would happen if you created a short table with the 
entries in the IN clause and used an "inner join"?

-Ajo
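
The join idea above can be tried out on a toy data set. The following is a minimal sketch, not Hive itself: it uses Python's built-in sqlite3 module as a stand-in, and the table wanted_phases, the column names, and the sample rows are all hypothetical. It only illustrates that an inner join against a short lookup table selects the same rows as a chain of '=' comparisons.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for Hive.
conn = sqlite3.connect(":memory:")

# Hypothetical fact table with a phase column to filter on.
conn.execute("CREATE TABLE r1 (seq_id INTEGER, process_phase TEXT)")
conn.executemany(
    "INSERT INTO r1 VALUES (?, ?)",
    [(1, "Production"), (2, "Stage"), (3, "Archived"), (4, "QA")],
)

# Hypothetical short table holding the values that would go in the IN list.
conn.execute("CREATE TABLE wanted_phases (phase TEXT)")
conn.executemany(
    "INSERT INTO wanted_phases VALUES (?)",
    [("Production",), ("Stage",), ("QA",), ("Development",)],
)

# Filter by joining against the lookup table.
via_join = conn.execute(
    "SELECT r1.seq_id FROM r1 "
    "INNER JOIN wanted_phases w ON r1.process_phase = w.phase "
    "ORDER BY r1.seq_id"
).fetchall()

# Filter with one '=' comparison per value.
via_or = conn.execute(
    "SELECT seq_id FROM r1 "
    "WHERE process_phase = 'Production' OR process_phase = 'Stage' "
    "OR process_phase = 'QA' OR process_phase = 'Development' "
    "ORDER BY seq_id"
).fetchall()

print(via_join)
print(via_join == via_or)
```

One caveat with the join form: the lookup table should hold each value only once, since a duplicated value would duplicate the matching rows in the join output, which an IN list would not do.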


On Mon, Feb 21, 2011 at 7:57 AM, Bejoy Ks <bejoy_ks@yahoo.com> wrote:

>Thanks, Jov, for the quick response.
>
>Could you please let me know which is the latest stable version of Hive? Also, 
>how do you find out your Hive version from the command line?
>
>Regarding SQL IN, I'm also currently using multiple '=' comparisons in my jobs, 
>but I still wanted to know whether there is a better alternative.
>
>
>
>Regards
>Bejoy KS
>
>
>
>
>
>
>
>________________________________
>From: Jov <zhao6014@gmail.com>
>To: user@hive.apache.org
>Sent: Mon, February 21, 2011 9:09:34 PM
>Subject: Re: Database/Schema , INTERVAL and SQL IN usages in Hive
>
>
>
>
>On 2011-2-21 at 10:54 PM, "Bejoy Ks" <bejoy_ks@yahoo.com> wrote:
>>
>> Hi Experts,
>>      I'm using Hive for a few projects, and I have found it a great tool in Hadoop 
>>for processing structured data end to end. Unfortunately, I'm facing a few 
>>challenges, as follows.
>>
>> Availability of database/schemas in Hive
>> I have multiple projects running in Hive, each with a fairly large number of 
>>tables. With this many tables all together, things are looking a bit messy. Is 
>>there any option for creating databases/schemas in Hive so that I can maintain 
>>the tables in different databases/schemas corresponding to each project?
>It seems the recent version already supports database DDL, so you can use 
>CREATE DATABASE.
>
>> Using INTERVAL 
>>     I need to replicate a job running in a Teradata EDW in Hive, and I'm facing 
>>a challenge here: I'm not able to identify a Hive equivalent of INTERVAL in 
>>Teradata. Here is the snippet where I'm facing the issue:
>>  *** where r1.seq_id = r4.seq_id and r4.mc_datetime >= (r1.rc_datetime + 
>>INTERVAL '05' HOUR)
>> In this query, how do I replicate the last part in Hive, i.e. (r1.rc_datetime + 
>>INTERVAL '05' HOUR), which adds 5 hours to the timestamp rc_datetime?
>> *The where condition is part of a very large query involving multiple table 
>>joins.
>Hive does not have a date or timestamp data type; all such values are strings. 
>But you can write your own UDF to implement a similar function.
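
The logic such a function would need is small. Below is a minimal sketch in Python of the "add N hours to a string timestamp" step, not an actual Hive UDF. It assumes the timestamps are stored as 'YYYY-MM-DD HH:MM:SS' strings, which the thread does not state; the names TS_FORMAT and add_hours are hypothetical.

```python
from datetime import datetime, timedelta

# Assumed storage format for the string timestamps (not stated in the thread).
TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def add_hours(ts, hours):
    """Return the string timestamp `ts` shifted forward by `hours` hours."""
    parsed = datetime.strptime(ts, TS_FORMAT)
    return (parsed + timedelta(hours=hours)).strftime(TS_FORMAT)


# Equivalent of (rc_datetime + INTERVAL '05' HOUR) on a sample value.
rc_datetime = "2011-02-21 22:04:41"
shifted = add_hours(rc_datetime, 5)
print(shifted)

# With this fixed-width, zero-padded format, plain string comparison
# agrees with chronological order, so >= works on the strings directly.
mc_datetime = "2011-02-22 04:00:00"
print(mc_datetime >= shifted)
```

Depending on the Hive version, the built-in unix_timestamp and from_unixtime functions may allow the same arithmetic on seconds without a custom UDF; that would need checking against the installed release.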
>
>>
>> Using IN 
>>     How do we replicate the SQL IN function in Hive?
>> i.e. *** where R1.seq_id = r4.seq_id and r1.PROCCESS_PHASE IN ( 'Production', 
>>'Stage' , 'QA', 'Development')
>> The last part of the query is where I'm facing the challenge: r1.PROCCESS_PHASE 
>>IN ( 'Production', 'Stage' , 'QA', 'Development')
>> *The where condition is part of a very large query involving multiple table 
>>joins.
>You can use OR, e.g.
>'x in (1,2)' can be written as 'x=1 or x=2'.
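
For a longer list, the OR chain can be generated rather than typed by hand. The following is a minimal sketch in Python; the helper names sql_literal and in_to_or are hypothetical, and to avoid guessing at dialect-specific escaping rules it simply rejects string values that contain a quote or a backslash.

```python
def sql_literal(value):
    """Render a Python int, float, or str as a simple SQL literal."""
    if isinstance(value, bool):
        raise TypeError("booleans are not supported")
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        # Escaping rules differ between SQL dialects, so refuse risky input.
        if "'" in value or "\\" in value:
            raise ValueError("quotes and backslashes are not supported")
        return "'" + value + "'"
    raise TypeError("unsupported literal type: " + type(value).__name__)


def in_to_or(column, values):
    """Expand `column IN (v1, v2, ...)` into `(column = v1 OR column = v2 ...)`."""
    terms = [column + " = " + sql_literal(v) for v in values]
    if not terms:
        raise ValueError("the value list must not be empty")
    # Parenthesise so the chain can be ANDed safely with other conditions.
    return "(" + " OR ".join(terms) + ")"


print(in_to_or("x", [1, 2]))
print(in_to_or("r1.PROCCESS_PHASE", ["Production", "Stage", "QA", "Development"]))
```

The surrounding parentheses matter here because the condition in question is combined with a join condition using AND, and AND binds more tightly than OR.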
>> Please advise.
>>
>> Regards
>> Bejoy KS
>>
>>
>>
>>
>>
>>
>>
>
>



      