From: Oleg Ruchovets
Date: Fri, 8 Dec 2017 15:43:53 +0800
Subject: Re: convert avro to orc
To: user@orc.apache.org

Hello Owen.

That is interesting. From your experience, will it support Hive external and managed tables? My idea was to prepare the ORC files (without Hive) and then register them as an external Hive table. The motivation is to avoid Hive schema maintenance.

Thanks
Oleg.

On Thu, Dec 7, 2017 at 2:55 AM, Owen O'Malley <owen.omalley@gmail.com> wrote:
> It would be a nice addition to the conversion tools. A first pass of
> converting Avro schemas to ORC would be pretty easy with:
>
> boolean -> boolean
> int -> int
> long -> long
> float -> float
> double -> double
> bytes -> binary
> string -> string
> enum -> string
> fixed -> binary
> map<X> -> map<string,X>
> array<X> -> array<X>
> record<X,Y,Z> -> struct<X,Y,Z>
> union<X,Y,Z> -> union<X,Y,Z>
>
> with special handling for union<null,X> -> X
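
The mapping quoted above is mechanical enough to sketch in a few lines. The Python below is only an illustration, not part of the ORC tools: `avro_to_orc` and `PRIMITIVES` are hypothetical names. It emits type strings in the spelling ORC's TypeDescription uses (`bigint` for long, `uniontype` for union). It assumes that logical types and namespaces can be ignored, and that a union of null plus several other branches becomes a uniontype of the non-null branches.

```python
import json

# Avro primitive name -> ORC type string, following the list quoted above.
PRIMITIVES = {
    "boolean": "boolean",
    "int": "int",
    "long": "bigint",  # ORC type strings spell the 64-bit integer "bigint"
    "float": "float",
    "double": "double",
    "bytes": "binary",
    "string": "string",
}


def avro_to_orc(schema, named=None):
    """Return an ORC type string for a parsed (JSON-decoded) Avro schema."""
    named = {} if named is None else named  # named Avro types seen so far

    if isinstance(schema, str):
        if schema in PRIMITIVES:
            return PRIMITIVES[schema]
        if schema in named:  # reference to an earlier record/enum/fixed
            return named[schema]
        raise ValueError(f"unsupported Avro type: {schema!r}")

    if isinstance(schema, list):  # Avro union
        # Assumption: null branches are dropped, so union<null,X> becomes X.
        branches = [avro_to_orc(s, named) for s in schema if s != "null"]
        if not branches:
            raise ValueError("a union of only null has no ORC equivalent")
        if len(branches) == 1:
            return branches[0]
        return "uniontype<" + ",".join(branches) + ">"

    kind = schema["type"]
    if kind == "array":
        return f"array<{avro_to_orc(schema['items'], named)}>"
    if kind == "map":  # Avro map keys are always strings
        return f"map<string,{avro_to_orc(schema['values'], named)}>"
    if kind == "record":
        fields = ",".join(
            f"{f['name']}:{avro_to_orc(f['type'], named)}"
            for f in schema["fields"]
        )
        orc = f"struct<{fields}>"
    elif kind == "enum":
        orc = "string"
    elif kind == "fixed":
        orc = "binary"
    else:  # wrapped type such as {"type": "string"}
        return avro_to_orc(kind, named)

    named[schema["name"]] = orc  # remember named types for later references
    return orc


schema = json.loads("""
{"type": "record", "name": "User", "fields": [
  {"name": "id", "type": "long"},
  {"name": "email", "type": ["null", "string"]},
  {"name": "tags", "type": {"type": "array", "items": "string"}},
  {"name": "attrs", "type": {"type": "map", "values": "double"}}
]}
""")
print(avro_to_orc(schema))
# struct<id:bigint,email:string,tags:array<string>,attrs:map<string,double>>
```

Recursive Avro records are rejected by this sketch, because a record's name is only registered after its fields have been converted.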

> In terms of the conversion, you would just need to extend ConvertTool to
> create RecordReaders for Avro. There are already examples of JSON and CSV.
>
> .. Owen


> On Mon, Dec 4, 2017 at 11:31 PM, Oleg Ruchovets <oruchovets@gmail.com> wrote:
>
>> Hello.
>> I wonder if there is a utility to convert Avro to ORC, similar to the
>> JSON to ORC one?
>>
>> Background of what I am doing:
>> I am reading SQL data using NiFi. NiFi returns data in Avro format. I
>> want to store this data on S3 in ORC format and use it for a Hive external
>> table. For that, I need to convert Avro to ORC and derive the Hive schema.
>> NiFi has an Avro-to-ORC component, but it supports only older versions of
>> Hive and ORC.
>>
>> So the question is how to convert Avro to ORC and derive the Hive schema.
>> I really like the utility that you built for JSON; it has both conversion
>> to ORC and Hive schema extraction. What is the way to achieve the same
>> for the Avro format?
>>
>> Thanks
>> Oleg.
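
For the "derive the Hive schema" part of the question, one possible approach is to generate the DDL from the top-level fields of the ORC struct, since ORC type strings are written in Hive's type syntax. This is a minimal sketch, not an existing tool: the function name, the table name, and the S3 location are made up for illustration.

```python
def hive_external_table_ddl(table, columns, location):
    """Build a CREATE EXTERNAL TABLE statement for ORC files at `location`.

    `columns` is a list of (name, hive_type) pairs, i.e. the top-level
    fields of the ORC struct schema. No escaping is done, so names and
    the location must not contain backticks or single quotes.
    """
    cols = ",\n".join(f"  `{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE `{table}` (\n{cols}\n)\n"
        f"STORED AS ORC\n"
        f"LOCATION '{location}'"
    )


# Hypothetical table and bucket, used only to show the output shape.
ddl = hive_external_table_ddl(
    "users",
    [("id", "bigint"), ("email", "string"), ("tags", "array<string>")],
    "s3a://example-bucket/users/",
)
print(ddl)
```

Because the table is external, dropping it in Hive leaves the ORC files at the location in place.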

