hawq-dev mailing list archives

From Lei Chang <lei_ch...@apache.org>
Subject Re: Options Usage of "hawq register"
Date Tue, 16 Aug 2016 00:23:58 GMT
I think this is a very useful feature for backup/restore, disaster
recovery and some other scenarios.

From the usage side, a "hawq" command typically follows the design pattern
"hawq action object". But for "hawq register", there is no "object" here.

---------------------------
hawq extract -o t1.yml t1;
hawq register -h localhost -p 5432 -U me -d postgres -c t1.yml;
---------------------------

Cheers
Lei


On Mon, Aug 15, 2016 at 3:25 PM, Hong Wu <xunzhangthu@gmail.com> wrote:

> Hi HAWQ developers,
>
> The purpose of this thread is to confirm the option usage of hawq register.
>
> So far there are two scenarios in which users would use the hawq register
> tool.
> - I. Register external Parquet data into HAWQ. For example, users want to
> migrate Parquet tables from Hive to HAWQ as quickly as possible. In this
> case, only the Parquet format is supported, and the original Parquet files
> in Hive are moved.
>
> - II. Users should be able to use hawq register to register table files
> into a new HAWQ cluster. From the users' perspective, this is a kind of
> protection against corruption: users apply the last-known-good metadata to
> update the portion of the catalog that manages HDFS blocks. The table files
> or directories should be backed up (for example, using distcp) to the same
> path in the new HDFS setting. In this case, both AO and Parquet formats
> are supported.
>
> Considering the cases above, the proposed options for hawq register look
> like this:
>
> hawq register [-h hostname] [-p port] [-U username] [-d database] [-t
> tablename] [-f filepath] [-c config]
> Note that the -h, -p, and -U options are optional. The -c option and the
> -t/-f options are mutually exclusive; they correspond to the two cases
> above (-t and -f for Case I, -c for Case II). Consequently, the expected
> usage of hawq register is as follows:
>
> - Case I
> hadoop fs -put -f hdfs://localhost:8020/hive/original_data.paq
> hdfs://localhost:8020/test_data.paq;
>
> create table t1(i int) with (appendonly = true, orientation = parquet);
>
> hawq register -h localhost -p 5432 -U me -d postgres -t t1 -f
> hdfs://localhost:8020/test_data.paq;
>
> - Case II
> hawq extract -o t1.yml t1;
>
> hawq register -h localhost -p 5432 -U me -d postgres -c t1.yml;
>
> Incorrect usage (in each of these cases, hawq register will print an error
> and then exit):
> hawq register -h localhost -p 5432 -U me -d postgres -c t1.yml -t t1;
> hawq register -h localhost -p 5432 -U me -d postgres -c t1.yml -f
> hdfs://localhost:8020/test_data.paq;
> hawq register -h localhost -p 5432 -U me -d postgres -c t1.yml -t t1 -f
> hdfs://localhost:8020/test_data.paq;
>
> Does this design make sense? Any comments? Thanks.
>
> Best
> Hong
>
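
For illustration only, the mutual-exclusion rule proposed above (-c versus
-t/-f) could be checked roughly as in the Python sketch below. This is not
the actual hawq register implementation: the function names, the error
messages, and the requirement that -t and -f be given together in Case I are
all assumptions made for the example. Only the option letters come from the
proposal.

```python
import argparse


def build_parser():
    # add_help=False because the proposal uses -h for the hostname, not help.
    parser = argparse.ArgumentParser(prog="hawq register", add_help=False)
    parser.add_argument("-h", dest="hostname")
    parser.add_argument("-p", dest="port", type=int)
    parser.add_argument("-U", dest="username")
    parser.add_argument("-d", dest="database")
    parser.add_argument("-t", dest="tablename")
    parser.add_argument("-f", dest="filepath")
    parser.add_argument("-c", dest="config")
    return parser


def parse_register_args(argv):
    """Return ("file", args) for Case I or ("config", args) for Case II.

    Hypothetical helper: exits with an error (SystemExit, code 2) when -c is
    combined with -t or -f, or when neither usage form is complete.
    """
    parser = build_parser()
    args = parser.parse_args(argv)
    has_config = args.config is not None
    has_table_or_file = args.tablename is not None or args.filepath is not None
    if has_config and has_table_or_file:
        parser.error("-c cannot be combined with -t or -f")
    if has_config:
        return "config", args
    # Assumption: Case I needs both the table name and the file path.
    if args.tablename is None or args.filepath is None:
        parser.error("either -c, or both -t and -f, must be given")
    return "file", args
```

With this sketch, parse_register_args(["-d", "postgres", "-c", "t1.yml"])
selects the Case II path, while adding "-t", "t1" to the same argument list
prints an error and exits, matching the incorrect-usage examples above.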
