hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Gesli, Nicole" <Nicole.Ge...@memorylane.com>
Subject Re: SQL help
Date Thu, 24 May 2012 22:10:33 GMT
Try this:

SELECT a.a_id, b.path
FROM ( SELECT a_id, MIN(t_timestamp) t_timestamp
       FROM   web_data
       GROUP BY a_id
     ) a JOIN
       web_data b ON ( b.a_id = a.a_id AND b.t_timestamp = a.t_timestamp )

-Nicole

From: Roberto Sanabria <roberto@stumbleupon.com<mailto:roberto@stumbleupon.com>>
Reply-To: <user@hive.apache.org<mailto:user@hive.apache.org>>
Date: Thu, 24 May 2012 15:06:29 -0700
To: <user@hive.apache.org<mailto:user@hive.apache.org>>
Subject: Re: SQL help

"I guess do some kind of group by and store it in intermediate file and run another select
on it?"

Yes, that is my recommendation.

On Thu, May 24, 2012 at 2:57 PM, Mohit Anchlia <mohitanchlia@gmail.com<mailto:mohitanchlia@gmail.com>>
wrote:


On Thu, May 24, 2012 at 2:19 PM, Edward Capriolo <edlinuxguru@gmail.com<mailto:edlinuxguru@gmail.com>>
wrote:
Hive is not SQL 92 compliant or whatever.

https://cwiki.apache.org/Hive/languagemanual.html

in particular you can not do subselects inside the in or the where
clause. Hive usually have other formulations like left semi join that
makes things 'like in' and 'not in' possible.

Thanks. But what I am looking for is to select only those rows that are of min(t_timestamp)
for a given a_id. What would be the best way? I guess do some kind of group by and store it
in intermediate file and run another select on it?

Edward
On Thu, May 24, 2012 at 5:13 PM, Mohit Anchlia <mohitanchlia@gmail.com<mailto:mohitanchlia@gmail.com>>
wrote:
> I am now trying to do it this way but doesn't work in hive. I think I am
> missing something here, can someone please help?
>
> select a_id from web_data t1 where a_id = (select min(a_id) from web_data t2
> where t2.t_timestamp = t1.t_timestamp)
>
> I get:
>
>
> FAILED: Parse Error: line 1:69 cannot recognize input near 'select' 'min'
> '(' in expression specification
>
>
>
> On Thu, May 24, 2012 at 1:02 PM, Mohit Anchlia <mohitanchlia@gmail.com<mailto:mohitanchlia@gmail.com>>
> wrote:
>>
>> I am new to Hive. I have several SQL from RDBMS database that I need to
>> convert to hive. What's the best reference for HIVEQL? For now I am trying
>> to figure out how to do this in hive:
>>
>> Select  distinct A_ID, First_Value(path IGNORE NULLS) over(PARTITION BY
>> A_ID ORDER BY t_timestamp) From WEB_DATA
>>
>> Any help would be appreciated.
>
>
>



Mime
View raw message