hawq-dev mailing list archives

From Hong Wu <xunzhang...@gmail.com>
Subject Re: Options Usage of "hawq register"
Date Tue, 16 Aug 2016 05:32:58 GMT
@Lei, I noticed that and thanks for pointing it out. The updated interface
should be like this:

- Case I
hadoop fs -put -f hdfs://localhost:8020/hive/original_data.paq
hdfs://localhost:8020/test_data.paq;

create table t1(i int) with (appendonly = true, orientation=parquet);

hawq register -h localhost -p 5432 -U me -d postgres -f
hdfs://localhost:8020/test_data.paq t1;

- Case II
hawq extract -o t1.yml t1;

hawq register -h localhost -p 5432 -U me -d postgres -c t1.yml t1;
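
For illustration only, here is a minimal Python sketch of how the interface
above could be parsed. It assumes argparse, uses the -U spelling from the
usage line quoted below, and assumes that exactly one of -f or -c must be
given. build_parser and the dest names are hypothetical; this is not the
actual hawq register implementation.

```python
import argparse


def build_parser():
    """Hypothetical parser for the proposed `hawq register` usage."""
    # add_help=False frees -h so it can mean "hostname", as in the usage line.
    parser = argparse.ArgumentParser(prog="hawq register", add_help=False)

    # Connection options; all optional, no defaults assumed here.
    parser.add_argument("-h", dest="hostname")
    parser.add_argument("-p", dest="port", type=int)
    parser.add_argument("-U", dest="username")
    parser.add_argument("-d", dest="database")

    # Assumption: exactly one of -f (Case I) or -c (Case II) is required.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("-f", dest="filepath")
    source.add_argument("-c", dest="config")

    # The "object" of the command: the table to register into.
    parser.add_argument("tablename")
    return parser
```

With this sketch, passing both -f and -c (or neither) makes argparse print an
error and exit with a non-zero status.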

Cheers
Hong

2016-08-16 11:34 GMT+08:00 Lili Ma <lma@pivotal.io>:

> @Lei, Since current hawq register supports two cases: Specifying tablename
> & filepath, or specifying .yml file, we proposed this usage.
>
> For the first case, we can follow the pattern of "hawq command" and change
> the tablename to the object, such as *hawq register [-h hostname] [-p port]
> [-U username] [-d database] [-f filepath] [-c config] <tablename>*
>
> But for the second case, because the table name is inside the .yml file, if
> we change tablename to object and mark it as a required field, it duplicates
> the name inside the configuration file. And what shall we do about name
> conflicts?
>
> Could you suggest a better way for defining the usage? Thanks :)
>
> Thanks
> Lili
>
> On Tue, Aug 16, 2016 at 8:23 AM, Lei Chang <lei_chang@apache.org> wrote:
>
> > I think this is a very useful feature for backup/restore, disaster
> > recovery and some other scenarios.
> >
> > From the usage side, "hawq register" follows the typical "hawq command"
> > design pattern: that is, "hawq action object". But for "hawq register",
> > there is no "object" here.
> >
> > ---------------------------
> > hawq extract -o t1.yml t1;
> > hawq register -h localhost -p 5432 -U me -d postgres -c t1.yml;
> > ---------------------------
> >
> > Cheers
> > Lei
> >
> >
> > On Mon, Aug 15, 2016 at 3:25 PM, Hong Wu <xunzhangthu@gmail.com> wrote:
> >
> > > Hi HAWQ developers,
> > >
> > > This thread is meant to confirm the option usage of hawq register.
> > >
> > > There are two scenarios so far in which users would use the hawq
> > > register tool.
> > > - I. Register external parquet data into HAWQ. For example, users want to
> > > migrate parquet tables from Hive to HAWQ as quickly as possible. In this
> > > case, only the parquet format is supported and the original parquet files
> > > in Hive are moved.
> > >
> > > - II. Users should be able to use hawq register to register table files
> > > into a new HAWQ cluster. From the users' perspective, this is a kind of
> > > protection against corruption. Users use the last-known-good metadata to
> > > update the portion of the catalog managing HDFS blocks. The table files or
> > > dictionary should be backed up (for example, using distcp) into the same
> > > path in the new HDFS setting. In this case, both AO and Parquet formats
> > > are supported.
> > >
> > > Considering the above cases, the designed options for hawq register look
> > > like this:
> > >
> > > hawq register [-h hostname] [-p port] [-U username] [-d database] [-t
> > > tablename] [-f filepath] [-c config]
> > > Note that the -h, -p, -U options are optional, and that the -c option and
> > > the -t, -f options are mutually exclusive, corresponding to the two
> > > different cases above. Consequently, the expected usage of hawq register
> > > should be like below:
> > >
> > > - Case I
> > > hadoop fs -put -f hdfs://localhost:8020/hive/original_data.paq
> > > hdfs://localhost:8020/test_data.paq;
> > >
> > > create table t1(i int) with (appendonly = true, orientation=parquet);
> > >
> > > hawq register -h localhost -p 5432 -U me -d postgres -t t1 -f
> > > hdfs://localhost:8020/test_data.paq;
> > >
> > > - Case II
> > > hawq extract -o t1.yml t1;
> > >
> > > hawq register -h localhost -p 5432 -U me -d postgres -c t1.yml;
> > >
> > > Incorrect usage (in each of these cases, hawq register will print an
> > > error and then exit):
> > > hawq register -h localhost -p 5432 -U me -d postgres -c t1.yml -t t1;
> > > hawq register -h localhost -p 5432 -U me -d postgres -c t1.yml -f
> > > hdfs://localhost:8020/test_data.paq;
> > > hawq register -h localhost -p 5432 -U me -d postgres -c t1.yml -t t1 -f
> > > hdfs://localhost:8020/test_data.paq;
> > >
> > > Does this design make sense? Any comments? Thanks.
> > >
> > > Best
> > > Hong
> > >
> >
>
