+1.
--Yiping
On 7/6/09 10:58 AM, "Dmitriy Ryaboy" <dvryaboy@cloudera.com> wrote:
> +1 for standard semantics.
>
> We need a COALESCE function to go along with this.
>
> -D
>
> On Mon, Jul 6, 2009 at 10:46 AM, Olga Natkovich <olgan@yahoo-inc.com> wrote:
>
>> Hi,
>>
>>
>>
>> The current implementation of COUNT and AVG in Pig counts null values.
>> This is inconsistent with SQL semantics and also with the semantics of other
>> aggregate functions such as SUM, MIN, and MAX. Originally we chose this
>> implementation for performance reasons; however, we re-implemented both
>> functions to support the multi-step combiner, and now the cost of checking
>> for null in the case where the combiner is invoked is trivial. (I ran some
>> tests with COUNT and they showed no performance difference.) We will pay
>> a penalty in the non-combinable case, including local mode, but I think it
>> is worth the price to have consistent semantics. Also, as we are working
>> on SQL support, having SQL-compliant semantics becomes very desirable.
>>
>>
>>
>> Please let us know if you have any concerns. I am planning to make the
>> change later this week.
>>
>>
>>
>> Olga
>>
>>
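For readers following the thread, the proposed semantics can be sketched in a few lines of Python. This is only an illustration, not Pig's implementation: the function names (count_non_null, avg_partial, avg_final, coalesce) are hypothetical, None stands in for a Pig null, and the partial/final split is an assumed simplification of how a combiner-friendly AVG might carry (sum, count) pairs between steps.

```python
from typing import Iterable, Optional, Tuple


def count_non_null(values: Iterable[Optional[float]]) -> int:
    """COUNT under the proposed semantics: nulls (None here) are skipped."""
    return sum(1 for v in values if v is not None)


def avg_partial(values: Iterable[Optional[float]]) -> Tuple[float, int]:
    """Combiner-style step (assumed shape): (sum, count) over non-null values."""
    total, n = 0.0, 0
    for v in values:
        if v is not None:  # the null check discussed in the thread
            total += v
            n += 1
    return total, n


def avg_final(partials: Iterable[Tuple[float, int]]) -> Optional[float]:
    """Final step: merge the partial pairs; null result if nothing was counted."""
    total, n = 0.0, 0
    for part_sum, part_count in partials:
        total += part_sum
        n += part_count
    return total / n if n else None


def coalesce(*args):
    """Return the first non-null argument, or None if all are null."""
    for a in args:
        if a is not None:
            return a
    return None
```

With these definitions, count_non_null([1, None, 3]) gives 2 rather than 3, and merging the partials of [1, None] and [3, None, None] averages over the two non-null values only. A caller who wants nulls treated as zero could map coalesce(v, 0) over the values before aggregating, which is the kind of use the COALESCE request above seems to have in mind.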
----------------------
Yiping Han
F-3140
(408)349-4403
yhan@yahoo-inc.com