hive-user mailing list archives

From Robin Verlangen <ro...@us2.nl>
Subject Re: Hive ignoring buckets when using dynamic where
Date Thu, 20 Sep 2012 14:00:05 GMT
Hi Bejoy,

Thank you for your reply. Is there any way to fix my problem? I want a
query with a dynamic date range: today only, or in some cases from x days
ago until now.

Best regards,

Robin Verlangen
*Software engineer*
W http://www.robinverlangen.nl
E robin@us2.nl

Disclaimer: The information contained in this message and attachments is
intended solely for the attention and use of the named addressee and may be
confidential. If you are not the intended recipient, you are reminded that
the information remains the property of the sender. You must not use,
disclose, distribute, copy, print or rely on this e-mail. If you have
received this message in error, please contact the sender immediately and
irrevocably delete this message and any copies.



2012/9/20 Bejoy KS <bejoy_ks@yahoo.com>

> Hi Robin
>
> The result of 'bdate=to_date(unix_timestamp())' is evaluated at query
> runtime. But the data that a query should process is determined up front,
> before the MapReduce jobs are executed. That is why the query runs over
> the whole data set.
>
> When you provide 'bdate='2012-09-01'', the Hive parser knows from the
> start which partitions should be taken into account. So this query runs
> on only the required partitions and not on the whole data set.
>
> To add on, it is the partitions, not the buckets, that are considered for
> the WHERE clause here.
>
> Regards,
> Bejoy KS
>
>   ------------------------------
> *From:* Robin Verlangen <robin@us2.nl>
> *To:* user@hive.apache.org
> *Sent:* Thursday, September 20, 2012 5:06 PM
> *Subject:* Hive ignoring buckets when using dynamic where
>
> Hi there,
>
> We're working on some queries that use buckets to improve performance by
> roughly 1000x. However, we ran into a problem. When we use a fixed,
> hardcoded date it works fine:
>
> SELECT * FROM standard_feed WHERE bdate='2012-09-01'
> *Starts a job with 6 mappers, 2 reducers*
>
> When we use it dynamically:
> SELECT * FROM standard_feed WHERE bdate=to_date(unix_timestamp())
> *Starts a job with 1000 mappers, 2 reducers*
>
> What's the problem here? Shouldn't the result of to_date on the current
> timestamp be equal to a normal fixed date? Does anyone have a solution?
>
> Best regards,
>
> Robin Verlangen
> *Software engineer*
> W http://www.robinverlangen.nl
> E robin@us2.nl
>
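Following the explanation quoted above, one possible workaround is to compute
the date outside Hive and put it into the query as a literal, so the partitions
are known before the job starts. Below is a minimal sketch in Python, not a
tested solution for this cluster. It assumes bdate is a string partition column
holding yyyy-MM-dd values (as in the hardcoded example), so that a string range
comparison selects the right days. The helper name build_feed_query and its
parameters are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional


def build_feed_query(days_back: int = 0, today: Optional[date] = None) -> str:
    """Build a HiveQL query whose bdate filter contains only literal dates.

    Hypothetical helper: assumes bdate is a string partition column with
    yyyy-MM-dd values, as in the hardcoded query from the thread.
    """
    if days_back < 0:
        raise ValueError("days_back must be >= 0")

    end = today or date.today()
    start = end - timedelta(days=days_back)

    if days_back == 0:
        # Single day: same shape as the hardcoded query that ran with 6 mappers.
        return f"SELECT * FROM standard_feed WHERE bdate='{end.isoformat()}'"

    # Range "now - x days until now": both bounds are literals, so no function
    # has to be evaluated at runtime to decide which partitions to read.
    return (
        "SELECT * FROM standard_feed "
        f"WHERE bdate>='{start.isoformat()}' AND bdate<='{end.isoformat()}'"
    )


if __name__ == "__main__":
    print(build_feed_query())             # today only
    print(build_feed_query(days_back=7))  # last 7 days until today
```

The generated string could then be submitted however queries are normally run,
for example with hive -e from a scheduled script. Whether the range form prunes
partitions as well as the single-day form should be checked with EXPLAIN on the
actual table.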
