Date: Mon, 26 Sep 2016 00:05:14 +0000 (UTC)
From: Lei Chang
To: dev@hawq.incubator.apache.org, Kyle Dunn
Subject: Re: PXF question with HAWQInputFormat to migrate data 1.x -> 2.x

I think it might be possible to use the HAWQ 1.x MR InputFormat to develop a 2.0 PXF plugin. Then we do not need to run the 2 versions together.

Cheers
Lei

On Mon, Sep 26, 2016 at 4:27 AM +0800, "Goden Yao" wrote:

+ dev mailing list, modified the title.

Hi Kyle.

Based on your description, your scenario is (as I understand):
1. HAWQ 1.x cluster installed.
2. HAWQ 2.x cluster installed on the same nodes.
3. Data migration (ETL) from HAWQ 1.x files to HAWQ 2.x using PXF (from the 2.x installation).

Is that correct?
So you want to develop a custom PXF plugin that can read HAWQ 1.x parquet data as external tables on HDFS, then insert it into a new HAWQ 2.x native table?

According to the 1.3 doc:
http://hdb.docs.pivotal.io/131/topics/HAWQInputFormatforMapReduce.html#hawqinputformatexample

1) Using HAWQInputFormat requires that you also run HAWQ 1.x (it needs a database URL to access metadata), so this means you need to run 1.x and 2.x side by side. In theory it should be doable, but configuration-wise, no one has tried this.
2) If you run HAWQ side by side, PXF will run side by side as well - you have to make sure there are no port conflicts and no ambiguity about which version of PXF you are invoking.

That's all I can think of for now.
-Goden

On Fri, Sep 23, 2016 at 12:10 PM Kyle Dunn wrote:

Glad to hear Resolver is the only other piece - should work out nicely.

So I'm looking at bolting on HAWQInputFormat to PXF (which actually looks quite straightforward) and I just want to ensure as many column types are supported as possible. This is motivated by needing to be able to read orphaned HAWQ 1.x files with PXF in HDB/HAWQ 2.x. It will make "in-place" upgrades much simpler.

Here is the list of datatypes HAWQInputFormat supports, and the potential mapping to PXF types:

On Fri, Sep 23, 2016 at 12:51 PM Goden Yao wrote:

Thanks for the wishes.
Are you talking about developing a new plugin (a new data source)?
Mapping data types has 2 parts:
1. What PXF recognizes from HAWQ. This is
   https://github.com/apache/incubator-hawq/blob/master/pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/io/DataType.java
2. What plugins recognize and want to convert to a HAWQ type (Resolver). Sample:
   https://github.com/apache/incubator-hawq/blob/master/pxf/pxf-hive/src/main/java/org/apache/hawq/pxf/plugins/hive/HiveResolver.java

Basically, 1 provides a type list, and 2 selects from that list to decide which data type should be converted to a HAWQ-recognized type.

If you're developing a new plugin with a new type mapping in HAWQ, you need to do both 1 and 2.

Which specific primitive type do you need that is not on the list?
BTW, you can also mail the dev mailing list so answers will be archived in public for everyone :)
-Goden

On Fri, Sep 23, 2016 at 11:43 AM Kyle Dunn wrote:

Hey Goden -

I'm looking at extending PXF for a new data source and noticed only a subset of the HAWQ-supported primitive datatypes are implemented in PXF. Is this as trivial as mapping a type to the corresponding OID in "api/io/DataType.java" or is there something more I'm missing?

Hope the new adventure is starting well.

-Kyle
-- 
Kyle Dunn | Data Engineering | Pivotal
Direct: 303.905.3171 | Email: kdunn@pivotal.io
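
The two-part type mapping discussed in this thread (a type list keyed by OID, plus a resolver that selects from that list) could look roughly like the sketch below. This is only an illustration under stated assumptions: `TypeMappingSketch`, `SketchType`, and `resolve` are hypothetical names, not the contents of PXF's DataType.java or of any Resolver, and the source-side type names are placeholders because the HAWQInputFormat type list is not included in the thread. The OID values are the standard PostgreSQL type OIDs for these primitives.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-in for the idea behind a type list plus a resolver.
// All names in this class are hypothetical; this is not PXF code.
class TypeMappingSketch {

    // Part 1: the type list, each entry carrying a PostgreSQL-style type OID.
    public enum SketchType {
        BOOLEAN(16), BIGINT(20), SMALLINT(21), INTEGER(23), TEXT(25),
        REAL(700), FLOAT8(701), UNSUPPORTED(-1);

        private final int oid;

        SketchType(int oid) {
            this.oid = oid;
        }

        public int getOid() {
            return oid;
        }
    }

    // Part 2: the resolver side, which picks an entry from the list
    // for each type name reported by the data source (names assumed here).
    private static final Map<String, SketchType> SOURCE_TO_TYPE = new HashMap<>();

    static {
        SOURCE_TO_TYPE.put("bool", SketchType.BOOLEAN);
        SOURCE_TO_TYPE.put("int8", SketchType.BIGINT);
        SOURCE_TO_TYPE.put("int2", SketchType.SMALLINT);
        SOURCE_TO_TYPE.put("int4", SketchType.INTEGER);
        SOURCE_TO_TYPE.put("text", SketchType.TEXT);
        SOURCE_TO_TYPE.put("float4", SketchType.REAL);
        SOURCE_TO_TYPE.put("float8", SketchType.FLOAT8);
    }

    // Returns the mapped type, or UNSUPPORTED when the source type is
    // missing from the list (the case where both parts need extending).
    public static SketchType resolve(String sourceTypeName) {
        return SOURCE_TO_TYPE.getOrDefault(sourceTypeName, SketchType.UNSUPPORTED);
    }

    public static void main(String[] args) {
        for (String name : new String[] {"int4", "float8", "interval"}) {
            SketchType t = resolve(name);
            System.out.println(name + " -> " + t + " (OID " + t.getOid() + ")");
        }
    }
}
```

A source type that falls through to `UNSUPPORTED` here corresponds to the situation Goden describes, where a new entry would be needed in the type list and a matching case in the resolver.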