hive-user mailing list archives

From Ajo Fod <ajo....@gmail.com>
Subject Re: Database/Schema , INTERVAL and SQL IN usages in Hive
Date Wed, 23 Feb 2011 14:31:07 GMT
Better in what sense? If it is time you are concerned about, there are
in-memory joins.

-Ajo

On Wed, Feb 23, 2011 at 3:39 AM, Bejoy Ks <bejoy_ks@yahoo.com> wrote:

> Ajo,
>     If we have a good number of elements in the comparison set, then going
> for a table would be beneficial. But in the case of a few elements, say 5, won't
> multiple '=' be better?
>
> Regards
> Bejoy KS
>
> ------------------------------
> *From:* Ajo Fod <ajo.fod@gmail.com>
>
> *To:* user@hive.apache.org
> *Sent:* Mon, February 21, 2011 10:04:41 PM
>
> *Subject:* Re: Database/Schema , INTERVAL and SQL IN usages in Hive
>
> On using SQL IN ... what would happen if you created a short table with the
> entries in the IN clause and used an "inner join"?
>
> -Ajo
>
> On Mon, Feb 21, 2011 at 7:57 AM, Bejoy Ks <bejoy_ks@yahoo.com> wrote:
>
>> Thanks Jov for the quick response
>>
>> Could you please let me know which is the latest stable version of Hive?
>> Also, how would you find out your Hive version from the command line?
>>
>> Regarding SQL IN, I'm also currently using multiple '=' in my jobs,
>> but I still wanted to know whether there is a better way to do the
>> same.
>>
>>
>> Regards
>> Bejoy KS
>>
>>
>>
>> ------------------------------
>> *From:* Jov <zhao6014@gmail.com>
>> *To:* user@hive.apache.org
>> *Sent:* Mon, February 21, 2011 9:09:34 PM
>> *Subject:* Re: Database/Schema , INTERVAL and SQL IN usages in Hive
>>
>>
>> On 2011-2-21 at 10:54 PM, "Bejoy Ks" <bejoy_ks@yahoo.com> wrote:
>> >
>> > Hi Experts
>> >      I'm using Hive for a few projects and I have found it a great tool in
>> Hadoop for processing structured data end to end. Unfortunately, I'm facing a few
>> challenges, as follows:
>> >
>> > Availability of database/schemas in Hive
>> > I have multiple projects running in Hive, each with a fairly large
>> number of tables. With this many tables all together, it is looking a bit
>> messy. Is there any option for creating a database/schema in Hive so that I
>> can maintain the tables in different databases/schemas corresponding to each
>> project?
>>
>> It seems the recent version already supports database DDL, so you can
>> use CREATE DATABASE.
>>
>> > Using INTERVAL
>> >     I need to replicate a job running in a Teradata EDW in Hive, and I'm
>> facing a challenge here. I am not able to identify a usage in Hive
>> corresponding to INTERVAL in Teradata. Here is the snippet where
>> I'm facing the issue:
>> >  *** where r1.seq_id = r4.seq_id and r4.mc_datetime >= (r1.rc_datetime +
>> INTERVAL '05' HOUR)
>> > In this query, how do I replicate the last part in Hive, i.e.
>> (r1.rc_datetime + INTERVAL '05' HOUR), where it adds 5 hours to the
>> timestamp rc_datetime?
>> > *The where condition is part of a very large query involving multiple
>> table joins.
>>
>> Hive does not have a date or timestamp data type; all such values are strings, but
>> you can write your own UDF to implement a similar function.
>>
>> >
>> > Using IN
>> >     How do we replicate the SQL IN function in Hive?
>> > i.e. *** where R1.seq_id = r4.seq_id and r1.PROCCESS_PHASE IN (
>> 'Production', 'Stage' , 'QA', 'Development')
>> > The last part of the query is where I'm facing the challenge:
>> r1.PROCCESS_PHASE IN ( 'Production', 'Stage' , 'QA', 'Development')
>> > *The where condition is part of a very large query involving multiple
>> table joins.
>>
>> You can use OR, e.g.
>>
>> 'x in(1,2)' can be 'x=1 or x=2'
>>
>> > Please advise.
>> >
>> > Regards
>> > Bejoy KS
>>
>>
>
>
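
The three approaches discussed in this thread (SQL IN, a chain of '=' joined with OR, and an inner join against a short table holding the IN entries) can be compared side by side. The following is only a minimal sketch: it uses SQLite through Python as a stand-in for Hive, so it shows that the three forms select the same rows but says nothing about how Hive executes or times them. The sample rows and the lookup table name `phases` are hypothetical; `r1`, `seq_id` and `PROCCESS_PHASE` are taken from the query quoted above.

```python
import sqlite3

# SQLite stands in for Hive here; the rows and the `phases` table are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE r1 (seq_id INTEGER, PROCCESS_PHASE TEXT);
    INSERT INTO r1 VALUES
        (1, 'Production'), (2, 'Stage'), (3, 'QA'),
        (4, 'Development'), (5, 'Archived');

    -- Short lookup table holding the entries of the IN list.
    CREATE TABLE phases (phase TEXT);
    INSERT INTO phases VALUES
        ('Production'), ('Stage'), ('QA'), ('Development');
""")


def seq_ids(sql):
    """Run a query and return the matching seq_id values in sorted order."""
    return sorted(row[0] for row in conn.execute(sql))


# Form 1: the SQL IN predicate from the original question.
with_in = seq_ids("""
    SELECT seq_id FROM r1
    WHERE PROCCESS_PHASE IN ('Production', 'Stage', 'QA', 'Development')
""")

# Form 2: multiple '=' comparisons combined with OR.
with_or = seq_ids("""
    SELECT seq_id FROM r1
    WHERE PROCCESS_PHASE = 'Production' OR PROCCESS_PHASE = 'Stage'
       OR PROCCESS_PHASE = 'QA' OR PROCCESS_PHASE = 'Development'
""")

# Form 3: inner join against the short lookup table.
with_join = seq_ids("""
    SELECT r1.seq_id FROM r1
    JOIN phases p ON r1.PROCCESS_PHASE = p.phase
""")

print(with_in, with_or, with_join)
```

Two caveats apply when carrying this over to a larger query. The OR chain needs its own parentheses once it is combined with other conditions through AND, because AND binds tighter than OR. The join form matches IN only while the lookup table holds no duplicate entries; a repeated entry would duplicate the matching rows.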
