hawq-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Kavinder Dhaliwal <kdhali...@pivotal.io>
Subject Re: hawq-site
Date Tue, 27 Sep 2016 19:16:05 GMT
Hi,

Could you please describe a bit more about the nature of your tables in
hive. What file formats are the underlying files? What is the total size of
the files?

Additionally, what the Java Heap size you have specified for pxf? and can
you share your DDL for creating the table.

For more debugging information I would suggest you tail both
/var/log/pxf/catalina.out and /var/log/pxf/pxf-service.log and then
run select count(1) from hcatalog.default.t1; It will be obvious from
catalina.out if there is insufficient memory allocated to pxf.

This will give more insight into where your problem is. Currently there is
a known issue with PXF, large Hive files (mainly compressed formats), and
an insufficient Java Heap size. After trying the above, perhaps you can try
increasing your heap size via the "-Xmx<SIZE>" jvm property and running
again.

Thanks,
Kavinder

On Mon, Sep 26, 2016 at 10:32 PM, 来熊 <yin.zhb@163.com> wrote:

> hi,all:
> I am using PXF and hcatalog to query hive, table t1,t2 in hive, and t1 is
> large table. and hawq in yarn mode.
>
> [hive@master ~]$ hive
> hive> select count(1) from t1;
> OK
> 680852926
> Time taken: 0.721 seconds, Fetched: 1 row(s)
> hive> exit;
>
> when I query t1 in hawq ,it is very very slow:
>
> [gpadmin@master ~]$
> [gpadmin@master ~]$ psql -U gpadmin -d gpadmin
> gpadmin=# set pxf_service_address to 'master:51200';
> SET
> Time: 0.410 ms
> gpadmin=# select count(*) from hcatalog.default.t2;
>  count
> -------
>   1000
> (1 row)
>
> Time: 910.853 ms
>
> gpadmin=# explain select count(*) from hcatalog.default.t1;
>                                              QUERY PLAN
>
> ------------------------------------------------------------
> ----------------------------------------
>  Aggregate  (cost=0.00..431.00 rows=1 width=8)
>    ->  Gather Motion 1:1  (slice1; segments: 1)  (cost=0.00..431.00 rows=1
> width=8)
>          ->  Aggregate  (cost=0.00..431.00 rows=1 width=8)
>                ->  External Scan on t1  (cost=0.00..431.00 rows=1 width=1)
>  Optimizer status: PQO version 1.627
> (5 rows)
>
> Time: 1388.073 ms
> gpadmin=#
> gpadmin=# select count(1) from hcatalog.default.t1;
>
> wait a long time,and cannot get result.
> log messages:
>
> 2016-09-27 09:46:25.816366 CST,"gpadmin","gpadmin",
> p764498,th-1935386496,"10.0.230.20","16234",2016-09-27 09:31:13
> CST,90355,con51,cmd20,seg-1,,,x90355,sx1,"LOG","00000","ConnID 5.
> Registered in HAWQ resource manager (By OID)",,,,,,"select count(*) from
> hcatalog.default.t1;",0,,"rmcomm_QD2RM.c",609,
> 2016-09-27 09:46:25.816508 CST,,,p760393,th-1935386496,,,
> ,0,con4,,seg-10000,,,,,"LOG","00000","ConnID 5. Expect query resource
> (256 MB, 0.022727 CORE) x 1 ( MIN 1 ) resource after adjusting based on
> queue NVSEG limits.",,,,,,,0,,"resqueuemanager.c",1913,
> 2016-09-27 09:46:25.816603 CST,,,p760393,th-1935386496,,,
> ,0,con4,,seg-10000,,,,,"LOG","00000","Latency of getting resource
> allocated is 138us",,,,,,,0,,"resqueuemanager.c",4375,
> 2016-09-27 09:46:25.816743 CST,"gpadmin","gpadmin",
> p764498,th-1935386496,"10.0.230.20","16234",2016-09-27 09:31:13
> CST,90355,con51,cmd20,seg-1,,,x90355,sx1,"LOG","00000","ConnID 5.
> Acquired resource from resource manager, (256 MB, 0.022727 CORE) x
> 1.",,,,,,"select count(*) from hcatalog.default.t1;",0,,"
> rmcomm_QD2RM.c",868,
> 2016-09-27 09:46:25.816868 CST,"gpadmin","gpadmin",
> p764498,th-1935386496,"10.0.230.20","16234",2016-09-27 09:31:13
> CST,90355,con51,cmd20,seg-1,,,x90355,sx1,"LOG","00000","data locality
> ratio: 0.000; virtual segment number: 1; different host number: 1; virtual
> segment number per host(avg/min/max): (1/1/1); segment size(avg/min/max):
> (0.000 B/0 B/0 B); segment size with penalty(avg/min/max): (0.000 B/0 B/0
> B); continuity(avg/min/max): (0.000/0.000/0.000).",,,,,,"select count(*)
> from hcatalog.default.t1;",0,,"cdbdatalocality.c",3396,
>
>
> I don't know why hawq only get such little resources.
> Is there any parameters I can set to let it (query hive using pxf and
> hcatalog) faster like in hive directly.
>
>
>

Mime
View raw message