hawq-dev mailing list archives

From Cyrille Lintz <cli...@pivotal.io>
Subject Re: HAWQ: Web external table on segments.
Date Thu, 06 Apr 2017 05:34:21 GMT
Hello,

gpssh could be a solution, but it requires access to the master.
I want to create external tables for some users who don't have access to
the master.

For example, I would like to create an external table in order to monitor
the spill files.

DROP EXTERNAL TABLE IF EXISTS spills ;

CREATE EXTERNAL WEB TABLE spills (hostname text, size text, path text)
EXECUTE E'du -sb /datac/hawq/segment /datad/hawq/segment /datae/hawq/segment /dataf/hawq/segment /datah/hawq/segment /datai/hawq/segment /dataj/hawq/segment /datak/hawq/segment | sed "s/^/$(hostname)\t /"'
ON 3
FORMAT 'TEXT'
(DELIMITER E'\t') ;
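
To check the command outside of HAWQ, the same pipeline can be sketched as a small shell function. This is only an illustration under a few assumptions: spill_report is a hypothetical helper name, GNU du is assumed (for the -b option), and the tab is produced with printf because in the DDL above it is the E'...' string that turns \t into a tab before the shell sees it. The sketch emits a bare tab between the hostname and the size.

```shell
#!/bin/sh
# Sketch of the per-host command behind the "spills" table.
# spill_report is a hypothetical helper name, not part of HAWQ.
# Assumes GNU du (for -b) is available on each host.
spill_report() {
    host=$(hostname 2>/dev/null || uname -n)
    tab=$(printf '\t')
    # du -sb prints "<bytes><TAB><path>" once per directory argument;
    # sed prepends "<hostname><TAB>" to get three tab-separated columns.
    du -sb "$@" | sed "s/^/${host}${tab}/"
}

# Example run against two throwaway directories instead of /data*/hawq/segment.
demo=$(mktemp -d)
mkdir -p "$demo/datac/hawq/segment" "$demo/datad/hawq/segment"
spill_report "$demo/datac/hawq/segment" "$demo/datad/hawq/segment"
rm -rf "$demo"
```

Each output line should then match the three columns (hostname, size, path) declared in the table, one line per directory passed in.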


Thanks,


*Cyrille LINTZ*Advisory Solution Architect  |  Pivotal Europe South
Mobile: + 33 (0)6 11 48 71 10 | clintz@pivotal.io

2017-04-06 4:13 GMT+02:00 Hubert Zhang <hzhang@pivotal.io>:

> Why not use gpssh to execute a shell command on each node?
>
> On Wed, Apr 5, 2017 at 3:11 PM, Cyrille Lintz <clintz@pivotal.io> wrote:
>
> > Hello,
> >
> > From the HDB guide (
> > http://hdb.docs.pivotal.io/212/hawq/reference/sql/CREATE-EXTERNAL-TABLE.html#topic1__section4),
> > I read the following about web external tables:
> >
> > *Note: ON ALL/HOST is deprecated when creating a readable external table,
> > as HAWQ cannot guarantee scheduling executors on a specific host. Instead,
> > use ON MASTER, ON <number>, or SEGMENT <virtual_segment> to specify which
> > segment instances will execute the command.*
> >
> >
> > In my opinion, if possible, we should re-introduce the ON ALL option for
> > external web tables. I am concerned about the ON <number> option in the
> > external web table definition: we have to use the current number of hosts,
> > so if we expand the cluster, we will have to change this external web table.
> >
> > - If the value is smaller than the actual number of hosts, some rows
> > will be missing.
> > - If the value is greater than the actual number of hosts, some rows
> > will be duplicated.
> >
> >
> > If we add the option ON ALL:
> >
> > - it will help to monitor the spill files
> > - it will help to read the segment log files (see the commented-out DDL
> > hawq_toolkit._hawq_log_segment_ext in the file $GPHOME/share/postgresql)
> >
> >
> > I know that the ON HOST and ON ALL options were deprecated due to the
> > elastic runtime in HAWQ 2.x. This is related to the Hadoop architecture.
> >
> > However, how could we execute a shell command exactly once on each host of
> > the cluster via an external web table?
> > In this case, we are not using the Hadoop FS, but the local FS.
> >
> > Thanks,
> >
> >
> > *Cyrille LINTZ*Advisory Solution Architect  |  Pivotal Europe South
> > Mobile: + 33 (0)6 11 48 71 10 | clintz@pivotal.io
> >
>
>
>
> --
> Thanks
>
> Hubert Zhang
>
