Date: Wed, 12 May 2010 10:44:10 +0800
Subject: BUG at optimizer or map side aggregate?
From: Ted Xu
To: hive-user@hadoop.apache.org

Hi all,

I think I found a bug, but I'm not sure whether the problem is in the optimizer (PPD) or in map-side aggregation.

See the query listed below:

-------------------------------------
create table if not exists dm_fact_buyer_prd_info_d (
  category_id string
  ,gmv_trade_num int
  ,user_id int
)
PARTITIONED BY (ds int);

set hive.optimize.ppd=true;
set hive.map.aggr=true;

explain select 20100426, category_id1, category_id2, assoc_idx
from (
  select
    category_id1
    , category_id2
    , count(distinct user_id) as assoc_idx
  from (
    select
      t1.category_id as category_id1
      , t2.category_id as category_id2
      , t1.user_id
    from (
      select category_id, user_id
      from dm_fact_buyer_prd_info_d
      where ds <= 20100426
        and ds > 20100419
        and category_id > 0
        and gmv_trade_num > 0
      group by category_id, user_id ) t1
    join (
      select category_id, user_id
      from dm_fact_buyer_prd_info_d
      where ds <= 20100426
        and ds > 20100419
        and category_id > 0
        and gmv_trade_num > 0
      group by category_id, user_id ) t2 on t1.user_id = t2.user_id
  ) t1
  group by category_id1, category_id2 ) t_o
where category_id1 <> category_id2
  and assoc_idx > 2;
--------------------------------

The query above fails when executed, throwing the exception: "can not cast UDFOpNotEqual(Text, IntWritable) to UDFOpNotEqual(Text, Text)".

I ran EXPLAIN on the query, and the execution plan looks really weird (see the predicate (category_id <> user_id) in the first Filter Operator of Stage-1):

--------------------------------
ABSTRACT SYNTAX TREE:
  (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_TABREF dm_fact_buyer_prd_info_d)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL category_id)) (TOK_SELEXPR (TOK_TABLE_OR_COL user_id))) (TOK_WHERE (and (and (and (<= (TOK_TABLE_OR_COL ds) 20100426) (> (TOK_TABLE_OR_COL ds) 20100419)) (> (TOK_TABLE_OR_COL category_id) 0)) (> (TOK_TABLE_OR_COL gmv_trade_num) 0))) (TOK_GROUPBY (TOK_TABLE_OR_COL category_id) (TOK_TABLE_OR_COL user_id)))) t1) (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_TABREF dm_fact_buyer_prd_info_d)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL category_id)) (TOK_SELEXPR (TOK_TABLE_OR_COL user_id))) (TOK_WHERE (and (and (and (<= (TOK_TABLE_OR_COL ds) 20100426) (> (TOK_TABLE_OR_COL ds) 20100419)) (> (TOK_TABLE_OR_COL category_id) 0)) (> (TOK_TABLE_OR_COL gmv_trade_num) 0))) (TOK_GROUPBY (TOK_TABLE_OR_COL category_id) (TOK_TABLE_OR_COL user_id)))) t2) (= (. (TOK_TABLE_OR_COL t1) user_id) (. (TOK_TABLE_OR_COL t2) user_id)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL t1) category_id) category_id1) (TOK_SELEXPR (. (TOK_TABLE_OR_COL t2) category_id) category_id2) (TOK_SELEXPR (. (TOK_TABLE_OR_COL t1) user_id))))) t1)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL category_id1)) (TOK_SELEXPR (TOK_TABLE_OR_COL category_id2)) (TOK_SELEXPR (TOK_FUNCTIONDI count (TOK_TABLE_OR_COL user_id)) assoc_idx)) (TOK_GROUPBY (TOK_TABLE_OR_COL category_id1) (TOK_TABLE_OR_COL category_id2)))) t_o)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR 20100426) (TOK_SELEXPR (TOK_TABLE_OR_COL category_id1)) (TOK_SELEXPR (TOK_TABLE_OR_COL category_id2)) (TOK_SELEXPR (TOK_TABLE_OR_COL assoc_idx))) (TOK_WHERE (and (<> (TOK_TABLE_OR_COL category_id1) (TOK_TABLE_OR_COL category_id2)) (> (TOK_TABLE_OR_COL assoc_idx) 2)))))

STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-2 depends on stages: Stage-1, Stage-4
  Stage-3 depends on stages: Stage-2
  Stage-4 is a root stage
  Stage-2 depends on stages: Stage-1, Stage-4
  Stage-3 depends on stages: Stage-2
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Alias -> Map Operator Tree:
        t_o:t1:t1:dm_fact_buyer_prd_info_d
          TableScan
            alias: dm_fact_buyer_prd_info_d
            Filter Operator
              predicate:
                  expr: (((((UDFToDouble(ds) <= UDFToDouble(20100426)) and (UDFToDouble(ds) > UDFToDouble(20100419))) and (UDFToDouble(category_id) > UDFToDouble(0))) and (gmv_trade_num > 0)) and (category_id <> user_id))
                  type: boolean
              Filter Operator
                predicate:
                    expr: ((((UDFToDouble(ds) <= UDFToDouble(20100426)) and (UDFToDouble(ds) > UDFToDouble(20100419))) and (UDFToDouble(category_id) > UDFToDouble(0))) and (gmv_trade_num > 0))
                    type: boolean
                Select Operator
                  expressions:
                        expr: category_id
                        type: string
                        expr: user_id
                        type: int
                  outputColumnNames: category_id, user_id
                  Group By Operator
                    keys:
                          expr: category_id
                          type: string
                          expr: user_id
                          type: int
                    mode: hash
                    outputColumnNames: _col0, _col1
                    Reduce Output Operator
                      key expressions:
                            expr: _col0
                            type: string
                            expr: _col1
                            type: int
                      sort order: ++
                      Map-reduce partition columns:
                            expr: _col0
                            type: string
                            expr: _col1
                            type: int
                      tag: -1
      Reduce Operator Tree:
        Group By Operator
          keys:
                expr: KEY._col0
                type: string
                expr: KEY._col1
                type: int
          mode: mergepartial
          outputColumnNames: _col0, _col1
          Select Operator
            expressions:
                  expr: _col0
                  type: string
                  expr: _col1
                  type: int
            outputColumnNames: _col0, _col1
            File Output Operator
              compressed: true
              GlobalTableId: 0
              table:
                  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat

  Stage: Stage-2
    Map Reduce
      Alias -> Map Operator Tree:
        $INTNAME
            Reduce Output Operator
              key expressions:
                    expr: _col1
                    type: int
              sort order: +
              Map-reduce partition columns:
                    expr: _col1
                    type: int
              tag: 0
              value expressions:
                    expr: _col0
                    type: string
                    expr: _col1
                    type: int
        $INTNAME1
            Reduce Output Operator
              key expressions:
                    expr: _col1
                    type: int
              sort order: +
              Map-reduce partition columns:
                    expr: _col1
                    type: int
              tag: 1
              value expressions:
                    expr: _col0
                    type: string
      Reduce Operator Tree:
        Join Operator
          condition map:
               Inner Join 0 to 1
          condition expressions:
            0 {VALUE._col0} {VALUE._col1}
            1 {VALUE._col0}
          outputColumnNames: _col0, _col1, _col2
          Select Operator
            expressions:
                  expr: _col0
                  type: string
                  expr: _col2
                  type: string
                  expr: _col1
                  type: int
            outputColumnNames: _col0, _col1, _col2
            Select Operator
              expressions:
                    expr: _col0
                    type: string
                    expr: _col1
                    type: string
                    expr: _col2
                    type: int
              outputColumnNames: _col0, _col1, _col2
              Group By Operator
                aggregations:
                      expr: count(DISTINCT _col2)
                keys:
                      expr: _col0
                      type: string
                      expr: _col1
                      type: string
                      expr: _col2
                      type: int
                mode: hash
                outputColumnNames: _col0, _col1, _col2, _col3
                File Output Operator
                  compressed: true
                  GlobalTableId: 0
                  table:
                      input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                      output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat

  Stage: Stage-3
    Map Reduce
      Alias -> Map Operator Tree:
        file:/group/tbdev/shaojie/scratch/420686432\10003
            Reduce Output Operator
              key expressions:
                    expr: _col0
                    type: string
                    expr: _col1
                    type: string
                    expr: _col2
                    type: int
              sort order: +++
              Map-reduce partition columns:
                    expr: _col0
                    type: string
                    expr: _col1
                    type: string
              tag: -1
              value expressions:
                    expr: _col3
                    type: bigint
      Reduce Operator Tree:
        Group By Operator
          aggregations:
                expr: count(DISTINCT KEY._col2)
          keys:
                expr: KEY._col0
                type: string
                expr: KEY._col1
                type: string
          mode: mergepartial
          outputColumnNames: _col0, _col1, _col2
          Select Operator
            expressions:
                  expr: _col0
                  type: string
                  expr: _col1
                  type: string
                  expr: _col2
                  type: bigint
            outputColumnNames: _col0, _col1, _col2
            Filter Operator
              predicate:
                  expr: ((_col0 <> _col1) and (UDFToDouble(_col2) > UDFToDouble(2)))
                  type: boolean
              Select Operator
                expressions:
                      expr: 20100426
                      type: int
                      expr: _col0
                      type: string
                      expr: _col1
                      type: string
                      expr: _col2
                      type: bigint
                outputColumnNames: _col0, _col1, _col2, _col3
                File Output Operator
                  compressed: false
                  GlobalTableId: 0
                  table:
                      input format: org.apache.hadoop.mapred.TextInputFormat
                      output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat

  Stage: Stage-4
    Map Reduce
      Alias -> Map Operator Tree:
        t_o:t1:t2:dm_fact_buyer_prd_info_d
          TableScan
            alias: dm_fact_buyer_prd_info_d
            Filter Operator
              predicate:
                  expr: ((((UDFToDouble(ds) <= UDFToDouble(20100426)) and (UDFToDouble(ds) > UDFToDouble(20100419))) and (UDFToDouble(category_id) > UDFToDouble(0))) and (gmv_trade_num > 0))
                  type: boolean
              Filter Operator
                predicate:
                    expr: ((((UDFToDouble(ds) <= UDFToDouble(20100426)) and (UDFToDouble(ds) > UDFToDouble(20100419))) and (UDFToDouble(category_id) > UDFToDouble(0))) and (gmv_trade_num > 0))
                    type: boolean
                Select Operator
                  expressions:
                        expr: category_id
                        type: string
                        expr: user_id
                        type: int
                  outputColumnNames: category_id, user_id
                  Group By Operator
                    keys:
                          expr: category_id
                          type: string
                          expr: user_id
                          type: int
                    mode: hash
                    outputColumnNames: _col0, _col1
                    Reduce Output Operator
                      key expressions:
                            expr: _col0
                            type: string
                            expr: _col1
                            type: int
                      sort order: ++
                      Map-reduce partition columns:
                            expr: _col0
                            type: string
                            expr: _col1
                            type: int
                      tag: -1
      Reduce Operator Tree:
        Group By Operator
          keys:
                expr: KEY._col0
                type: string
                expr: KEY._col1
                type: int
          mode: mergepartial
          outputColumnNames: _col0, _col1
          Select Operator
            expressions:
                  expr: _col0
                  type: string
                  expr: _col1
                  type: int
            outputColumnNames: _col0, _col1
            File Output Operator
              compressed: true
              GlobalTableId: 0
              table:
                  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat

  Stage: Stage-0
    Fetch Operator
      limit: -1
--------------------------------

I tried disabling predicate pushdown (set hive.optimize.ppd=false), and the error went away. I also tried disabling map-side aggregation (set hive.map.aggr=false), and the error went away too.

Does anybody know what the problem is? Please give me some advice.

--
Best Regards,
Ted Xu

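
For reference, the two workarounds described in the message can be written as session-level settings. This is only a minimal sketch: it assumes the settings are issued in the same Hive session before the failing query, and it works around the error without fixing whatever causes it. Either setting on its own was reported to make the error disappear, so only one of them should be needed.

```sql
-- Workaround A: disable predicate pushdown (PPD) for this session
set hive.optimize.ppd=false;

-- Workaround B (alternative to A): disable map-side aggregation for this session
set hive.map.aggr=false;
```

Re-running the EXPLAIN statement after applying one of these settings and checking that the (category_id <> user_id) predicate no longer appears in the Stage-1 Filter Operator would be one way to confirm which optimization introduces it.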