hive-user mailing list archives

From Bejoy Ks <bejoy...@yahoo.com>
Subject Re: inconsistent results when doing a select over a join
Date Mon, 09 Jan 2012 17:51:48 GMT
Hi Guy
        The easiest way to nail down the root cause is divide and conquer.
You can try the following:
- Ensure the results are consistent on the individual tables, without joins.
- Try to narrow down the input to your join with a few extra ON conditions.

This should tell you whether it is an issue with the code or a data quality issue.
It is most likely a data quality issue.
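
The two checks above can be sketched roughly as follows. This is only an illustration: SQLite stands in for Hive so that it runs anywhere, and the tables `a` and `b`, the join key `k`, and the `scalar` helper are hypothetical names, not anything from the original query. The NULL-key and duplicate-key counts are an assumed example of what a "data quality" check might look at.

```python
import sqlite3

# SQLite stands in for Hive here so the sketch is runnable; the tables
# `a` and `b` and the join key `k` are hypothetical placeholders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (k INTEGER, v TEXT);
    CREATE TABLE b (k INTEGER, w TEXT);
    INSERT INTO a VALUES (1, 'x'), (2, 'y'), (2, 'y-dup'), (NULL, 'z');
    INSERT INTO b VALUES (1, 'p'), (2, 'q'), (3, 'r');
""")

def scalar(sql, params=()):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql, params).fetchone()[0]

# Step 1: count each table on its own, twice; the two runs should agree.
count_a = [scalar("SELECT COUNT(*) FROM a") for _ in range(2)]
count_b = [scalar("SELECT COUNT(*) FROM b") for _ in range(2)]

# Step 2: narrow the join input with an extra ON condition (one key range).
narrow_join = scalar(
    "SELECT COUNT(*) FROM a JOIN b ON a.k = b.k AND a.k BETWEEN ? AND ?",
    (1, 2),
)

# Assumed data-quality checks: NULL join keys and duplicated join keys.
null_keys = scalar("SELECT COUNT(*) FROM a WHERE k IS NULL")
dup_keys = scalar(
    "SELECT COUNT(*) FROM "
    "(SELECT k FROM a WHERE k IS NOT NULL GROUP BY k HAVING COUNT(*) > 1) AS d"
)

print(count_a, count_b, narrow_join, null_keys, dup_keys)
```

If the per-table counts are stable but the narrowed join count still changes between runs, the key range can be halved repeatedly until the rows responsible are isolated.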

Regards
Bejoy.K.S



________________________________
 From: Guy Doulberg <guy.doulberg@conduit.com>
To: user@hive.apache.org 
Sent: Monday, January 9, 2012 11:16 PM
Subject: Re: inconsistent results when doing a select over a join
 

Hey Dave,
I didn't understand your question.

The inconsistency is slight: the results differ by about 2%.

Thanks

Guy

On 01/09/2012 07:05 PM, David Houston wrote: 
>Hi Guy,
>Inconsistent in what way: are the results totally off, or is the order different?
>Thanks
>Dave
>On Jan 9, 2012 5:03 PM, "Guy Doulberg" <guy.doulberg@conduit.com> wrote:
>
>Hi guys,
>>
>>We have been using Hive for a while now, and recently we
>>encountered an issue we just can't understand.
>>
>>We are running a select (it includes count(*)) over a join of
>>two big tables.
>>
>>We ran the same query twice consecutively over the same two
>>tables, and each time the results were slightly different.
>>
>>We don't know how to debug this issue or where to
>>look. Any ideas?
>>
>>Thanks
>>
>>Guy Doulberg,
>>Data infrastructure engineer,
>>Conduit
>>