hive-user mailing list archives

From Bejoy Ks <>
Subject Re: inconsistent results when doing a select over a join
Date Mon, 09 Jan 2012 17:51:48 GMT
Hi Guy
        The easiest way to nail down the root cause is divide and conquer.
You can try the following:
-ensure the results are consistent on each individual table, without the join

-try to narrow down the input to your join with a few extra ON conditions
That way you can tell whether it is an issue with the code or a data quality issue. Most likely it is a
data quality issue.
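
A rough sketch of the two checks above, in case it helps. It is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for Hive, and the tables (clicks, users), the join key (user_id) and the helper names are all made up for the example. The same COUNT(*) queries could be adapted to run against the real tables.

```python
import sqlite3


def table_checks(conn, table, key):
    """Return (row_count, null_keys, duplicated_keys) for one join input."""
    rows = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    nulls = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {key} IS NULL"
    ).fetchone()[0]
    # Number of distinct key values that occur more than once.
    dups = conn.execute(
        f"SELECT COUNT(*) FROM (SELECT {key} FROM {table} "
        f"WHERE {key} IS NOT NULL GROUP BY {key} HAVING COUNT(*) > 1)"
    ).fetchone()[0]
    return rows, nulls, dups


def join_count(conn, left, right, key, where="1=1"):
    """COUNT(*) over the join, optionally narrowed by an extra condition."""
    sql = (
        f"SELECT COUNT(*) FROM {left} a JOIN {right} b "
        f"ON a.{key} = b.{key} WHERE {where}"
    )
    return conn.execute(sql).fetchone()[0]


def make_demo_db():
    """Tiny hypothetical data set with a NULL key and duplicated keys."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE clicks (user_id INTEGER, day TEXT);
        CREATE TABLE users  (user_id INTEGER, country TEXT);
        INSERT INTO clicks VALUES (1, 'd1'), (1, 'd2'), (2, 'd1'), (NULL, 'd1');
        INSERT INTO users  VALUES (1, 'US'), (2, 'IL'), (2, 'IL'), (3, 'UK');
        """
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    # Step 1: check each table on its own, running every check twice.
    for table in ("clicks", "users"):
        first = table_checks(conn, table, "user_id")
        second = table_checks(conn, table, "user_id")
        print(table, first, "stable" if first == second else "UNSTABLE")
    # Step 2: narrow the join input and compare repeated runs.
    for cond in ("1=1", "a.day = 'd1'"):
        first = join_count(conn, "clicks", "users", "user_id", cond)
        second = join_count(conn, "clicks", "users", "user_id", cond)
        print(cond, first, "stable" if first == second else "UNSTABLE")
```

If the per-table counts are stable but the join count is not, narrowing the condition further (one day, one key range) should show which slice of the data behaves differently. NULL or duplicated join keys are the kind of data quality problem worth ruling out first.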


 From: Guy Doulberg <>
Sent: Monday, January 9, 2012 11:16 PM
Subject: Re: inconsistent results when doing a select over a join

Hey Dave,
I didn't understand your question.

The inconsistency is small: the results differ by about 2%.



On 01/09/2012 07:05 PM, David Houston wrote: 
>Hi Guy,
>Inconsistent in what way: are the results totally off, or is the order different?
>On Jan 9, 2012 5:03 PM, "Guy Doulberg" <> wrote:
>>Hi guys,
>>We have been using Hive for a while now, and recently we
>>encountered an issue we just can't understand.
>>We are selecting (the select includes count(*)) over a join of
>>two big tables.
>>We ran the same query twice consecutively over the same two
>>tables, and each time the results were slightly different.
>>We don't know how we should debug this issue or where we should
>>look. Any ideas?
>>Guy Doulberg,
>>Data infrastructure engineer,