From 来熊 <yin....@163.com>
Subject hawq-site
Date Tue, 27 Sep 2016 05:32:03 GMT
I am using PXF and HCatalog to query Hive from HAWQ, which runs in YARN mode.
Hive has two tables, t1 and t2; t1 is the large one.

[hive@master ~]$ hive
hive> select count(1) from t1;
Time taken: 0.721 seconds, Fetched: 1 row(s)
hive> exit;

When I query t1 in HAWQ, it is very slow:

[gpadmin@master ~]$ 
[gpadmin@master ~]$ psql -U gpadmin -d gpadmin 
gpadmin=# set pxf_service_address to 'master:51200';
Time: 0.410 ms
gpadmin=# select count(*) from hcatalog.default.t2;
(1 row)

Time: 910.853 ms

gpadmin=# explain select count(*) from hcatalog.default.t1;
                                             QUERY PLAN                                  
 Aggregate  (cost=0.00..431.00 rows=1 width=8)
   ->  Gather Motion 1:1  (slice1; segments: 1)  (cost=0.00..431.00 rows=1 width=8)
         ->  Aggregate  (cost=0.00..431.00 rows=1 width=8)
               ->  External Scan on t1  (cost=0.00..431.00 rows=1 width=1)
 Optimizer status: PQO version 1.627
(5 rows)

Time: 1388.073 ms
gpadmin=# select count(1) from hcatalog.default.t1;

The query runs for a long time and never returns a result.
Log messages:

2016-09-27 09:46:25.816366 CST,"gpadmin","gpadmin",p764498,th-1935386496,"","16234",2016-09-27
09:31:13 CST,90355,con51,cmd20,seg-1,,,x90355,sx1,"LOG","00000","ConnID 5. Registered in HAWQ
resource manager (By OID)",,,,,,"select count(*) from hcatalog.default.t1;",0,,"rmcomm_QD2RM.c",609,
2016-09-27 09:46:25.816508 CST,,,p760393,th-1935386496,,,,0,con4,,seg-10000,,,,,"LOG","00000","ConnID
5. Expect query resource (256 MB, 0.022727 CORE) x 1 ( MIN 1 ) resource after adjusting based
on queue NVSEG limits.",,,,,,,0,,"resqueuemanager.c",1913,
2016-09-27 09:46:25.816603 CST,,,p760393,th-1935386496,,,,0,con4,,seg-10000,,,,,"LOG","00000","Latency
of getting resource allocated is 138us",,,,,,,0,,"resqueuemanager.c",4375,
2016-09-27 09:46:25.816743 CST,"gpadmin","gpadmin",p764498,th-1935386496,"","16234",2016-09-27
09:31:13 CST,90355,con51,cmd20,seg-1,,,x90355,sx1,"LOG","00000","ConnID 5. Acquired resource
from resource manager, (256 MB, 0.022727 CORE) x 1.",,,,,,"select count(*) from hcatalog.default.t1;",0,,"rmcomm_QD2RM.c",868,
2016-09-27 09:46:25.816868 CST,"gpadmin","gpadmin",p764498,th-1935386496,"","16234",2016-09-27
09:31:13 CST,90355,con51,cmd20,seg-1,,,x90355,sx1,"LOG","00000","data locality ratio: 0.000;
virtual segment number: 1; different host number: 1; virtual segment number per host(avg/min/max):
(1/1/1); segment size(avg/min/max): (0.000 B/0 B/0 B); segment size with penalty(avg/min/max):
(0.000 B/0 B/0 B); continuity(avg/min/max): (0.000/0.000/0.000).",,,,,,"select count(*) from

I don't know why HAWQ gets so few resources.
Are there any parameters I can set to make queries on Hive through PXF and HCatalog
as fast as querying Hive directly?

