From: Varun Rao
Date: Tue, 2 Apr 2019 12:31:13 -0400
Subject: Re: About mapping a phoenix table
To: dev@phoenix.apache.org

Hello Reid,

I don't know if this fits your use case, but there is a way of copying data
from a Phoenix table to another Phoenix table in another cluster, provided
the data is not yet present in either table. We can use the fact that
Phoenix stores its data in underlying HBase tables.
Therefore, by enabling replication on the underlying source HBase table,
adding the destination cluster as a peer on the source HBase cluster, and
setting the source HBase table's replication scope to 1, any data flowing
into the source Phoenix table will be copied to the destination Phoenix
table.

1) Create the Phoenix table on both the source and destination clusters.

2) Set replication=true on the source HBase cluster (through Cloudera
Manager, in my case).

3) Add the destination cluster as a peer on the source cluster via the
HBase shell:

    add_peer '1', CLUSTER_KEY => "some_node:2181:/hbase", TABLE_CFS => {"TABLE_NAME" => ["column_1", "column_2"]}

Here you can specify which column families you would like to copy over.

4) Disable the source Phoenix table, set its replication scope to 1, and
re-enable it:

    disable 'BIGTABLE_PHOENIX'
    alter 'BIGTABLE_PHOENIX', {NAME => 'w_a', REPLICATION_SCOPE => '1'}, {NAME => 'w_b', REPLICATION_SCOPE => '1'}
    enable 'BIGTABLE_PHOENIX'

5) Send the data. In my test case I used psql to send 2 million records
from a CSV file to the source Phoenix table:

    phoenix-psql.py -t BIGTABLE_PHOENIX localhost:2181 wine_mag.csv

6) You can now see the same data on both the source and target clusters.

Thanks,

Yours Truly,
Varun Rao


On Tue, Apr 2, 2019 at 12:04 PM Nick Dimiduk wrote:
> Hi Reid,
>
> I'll throw my +1 onto Anil's Approach #1. I followed this path recently
> to migrate all of our production data. Migrating Phoenix metadata by
> creating tables manually on the destination is a little clunky, but
> HBase Snapshots are quite easy to work with.
>
> Good luck,
> Nick
>
> On Tue, Apr 2, 2019 at 5:26 AM anil gupta wrote:
>
> > Hey Reid,
> > AFAIK, there is no official Phoenix tool to copy a table between
> > clusters. IMO, it would be great to have an official tool to copy
> > tables. In our case, the source and destination clusters are both
> > running Phoenix 4.7. IMO, a copy between 4.7 and 4.14 might have some
> > version incompatibility.
> > So, you might need to test the following in non-prod first.
> >
> > Approach 1: We usually move tables by taking a snapshot of the HBase
> > table, exporting the snapshot to the remote cluster, creating the
> > Phoenix table, deleting the underlying HBase table, and restoring the
> > snapshot. Please keep in mind that you will need to do a similar
> > exercise if your table has secondary indexes, since they are stored
> > in another HBase table. Also, make sure that you don't have any live
> > traffic to the Phoenix table in the destination cluster until the
> > snapshot has been restored and the data in the table verified.
> >
> > Approach 2: Use the copyTable util of HBase. In this case, you will
> > just need to create the Phoenix table on the remote cluster and then
> > kick off HBase copyTable. In this approach too, you will need to
> > perform a copyTable for each secondary index.
> >
> > We usually use Approach 1 because it's usually faster and doesn't put
> > write load on the cluster.
> >
> > HTH,
> > Anil Gupta
> >
> > > On Apr 2, 2019, at 4:32 AM, Reid Chan wrote:
> > >
> > > Hi team,
> > >
> > > I'm trying to transport a Phoenix table between two clusters by
> > > copying all related HBase files on HDFS from cluster A to cluster B.
> > > But after I executed the CreateTableStatement in Phoenix, Phoenix
> > > failed to map those files into the table, and `select *` got
> > > nothing.
> > >
> > > The questions are:
> > > Is there a proper way or tool to do the table transportation?
> > > If the answer is no, can the team provide some code pointers if I
> > > want to implement it?
> > > Or the reason why this is infeasible?
> > >
> > > FYI,
> > > both HBase versions are 1.x but differ in minor version;
> > > the Phoenix version gap is huge, 4.7.0 vs 4.14.1.
> > >
> > > Any suggestions are appreciated!
> > > Thanks
> > >
> > > --------------------------
> > >
> > > Best regards,
> > > R.C
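
[Editor's note: Anil's Approach 1 can be sketched as the HBase shell / CLI
commands below. This is a rough sketch, not a tested procedure: the table
name MY_TABLE, the snapshot name, the destination NameNode address, and
the ZooKeeper quorum are all hypothetical placeholders, and it assumes the
matching Phoenix table (and any secondary index tables) has already been
created on the destination cluster.]

    # On the source cluster: snapshot the HBase table that backs the
    # Phoenix table.
    hbase shell <<'EOF'
    snapshot 'MY_TABLE', 'my_table_snapshot'
    EOF

    # Copy the snapshot to the destination cluster's HBase root dir.
    hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
      -snapshot my_table_snapshot \
      -copy-to hdfs://dest-namenode:8020/hbase \
      -mappers 16

    # On the destination cluster: the Phoenix CREATE TABLE has already
    # created the HBase table, so disable it and replace its contents
    # with the snapshot. Repeat for each secondary index table.
    hbase shell <<'EOF'
    disable 'MY_TABLE'
    restore_snapshot 'my_table_snapshot'
    enable 'MY_TABLE'
    EOF

For Approach 2, the equivalent single command (again with placeholder
addresses) would be along the lines of:

    hbase org.apache.hadoop.hbase.mapreduce.CopyTable \
      --peer.adr=dest-zk:2181:/hbase MY_TABLE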