Return-Path: X-Original-To: apmail-hadoop-common-user-archive@www.apache.org Delivered-To: apmail-hadoop-common-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id A0FAA17EC9 for ; Thu, 26 Feb 2015 17:22:50 +0000 (UTC) Received: (qmail 21250 invoked by uid 500); 26 Feb 2015 17:22:44 -0000 Delivered-To: apmail-hadoop-common-user-archive@hadoop.apache.org Received: (qmail 21148 invoked by uid 500); 26 Feb 2015 17:22:44 -0000 Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@hadoop.apache.org Delivered-To: mailing list user@hadoop.apache.org Received: (qmail 21138 invoked by uid 99); 26 Feb 2015 17:22:44 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 26 Feb 2015 17:22:44 +0000 X-ASF-Spam-Status: No, hits=2.0 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS,URI_NOVOWEL X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of wget.null@gmail.com designates 74.125.82.179 as permitted sender) Received: from [74.125.82.179] (HELO mail-we0-f179.google.com) (74.125.82.179) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 26 Feb 2015 17:22:18 +0000 Received: by wesq59 with SMTP id q59so13010911wes.1 for ; Thu, 26 Feb 2015 09:21:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:content-type:message-id:mime-version:subject:date:references :to:in-reply-to; bh=fBzXZAgo7nJB1pvmHx1YUE5UeZ+/wMB0B0XAAYOcfx8=; b=VknW/LB5ez97bpD36mqxtAiRiSStU/0gLzaBADnxhAeLa7wX9WFuqrWU24E/TYlOxo gjGqKbvuwqriBLVhPaDrnBKfzvizBQh4GTyi5Ky4yUVYCN0RLFQD6S5K77onEipg6WEt aUC+KABU9rApJF/6ietqIePMcd74/sfTmWa5e7OQMZlmtN7f+KLqwMez4eAZR6fuBDNo oyujaG0JGmNwyhfct7Ofq3FhonFVwUCBCAKuzBRsyYEuZ9yF51t7Ugrt3v9hzj3HLjPJ OsZgBg62B/QJe30qPH87DC0oGeK1GQZevaXJj0KEJK/U+qcGwU3rFck6+5iMBUCzh+QV CsCA== X-Received: by 10.194.239.129 with SMTP id vs1mr18594839wjc.158.1424971292117; Thu, 26 Feb 2015 09:21:32 -0800 (PST) Received: from donald.railnet.train ([88.128.80.57]) by mx.google.com with ESMTPSA id n2sm3712313wiw.16.2015.02.26.09.21.29 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Thu, 26 Feb 2015 09:21:30 -0800 (PST) From: Alexander Alten-Lorenz Content-Type: multipart/alternative; boundary="Apple-Mail=_C344996E-2130-4583-BD56-522FF26BF707" Message-Id: <74CAF5C7-AB45-4001-A2CA-C51F916886F9@gmail.com> Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\)) Subject: Re: Impala CREATE TABLE AS AVRO Requires "Redundant" Schema - Why? Date: Thu, 26 Feb 2015 18:21:26 +0100 References: <481B59B8276CBD4C95DC1C2D87D09FF63AAB1195@USW20015568.gbl.ad.hedani.net> To: user@hadoop.apache.org In-Reply-To: <481B59B8276CBD4C95DC1C2D87D09FF63AAB1195@USW20015568.gbl.ad.hedani.net> X-Mailer: Apple Mail (2.2070.6) X-Virus-Checked: Checked by ClamAV on apache.org --Apple-Mail=_C344996E-2130-4583-BD56-522FF26BF707 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 Hi, Impala is a product of Cloudera. You might request help per: https://groups.google.com/a/cloudera.org/forum/#!forum/impala-user = BR,=20 Alex > On 26 Feb 2015, at 17:15, Vitale, Tom = wrote: >=20 > I used sqoop to import an MS SQL Server table into an Avro file on = HDFS. No problem. Then I tried to create an external Impala table using = the following DDL: > =20 > CREATE EXTERNAL TABLE AvroTable > STORED AS AVRO > LOCATION '/tmp/AvroTable'; > =20 > I got the error =E2=80=9CERROR: AnalysisException: Error loading Avro = schema: No Avro schema provided in SERDEPROPERTIES or TBLPROPERTIES for = table: default.AvroTable=E2=80=9D > =20 > So I extracted the schema from the Avro file using the = avro-tools-1.7.4.jar (-getschema) into a JSON file, then per the = recommendation above, changed the DDL to point to it: > =20 > CREATE EXTERNAL TABLE AvroTable > STORED AS AVRO > LOCATION '/tmp/AvroTable' > TBLPROPERTIES( > 'serialization.format'=3D'1', > = 'avro.schema.url'=3D'hdfs://xxxxxxxx.xxxxxxxx.xxxxxxxx.net/tmp/AvroTable.s= chema' > ); > =20 > This worked fine. But my question is, why do you have to do this? = The schema is already in the Avro file =E2=80=93 that=E2=80=99s where I = got the JSON schema file that I point to in the TBLPROPERTIES parameter! > =20 > Thanks, Tom > =20 > Tom Vitale > CREDIT SUISSE > Information Technology | Infra Arch & Strategy NY, KIVP > Eleven Madison Avenue | 10010-3629 New York | United States > Phone +1 212 538 0708 > thomas.vitale@credit-suisse.com = | www.credit-suisse.com = > =20 >=20 >=20 >=20 > = =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D > Please access the attached hyperlink for an important electronic = communications disclaimer: > http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html = > = =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D --Apple-Mail=_C344996E-2130-4583-BD56-522FF26BF707 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8
Hi,

Impala is a product of Cloudera. You = might request help per:

BR, 
 Alex


On 26 Feb 2015, at 17:15, = Vitale, Tom <thomas.vitale@credit-suisse.com> wrote:

I used sqoop to import an MS SQL = Server table into an Avro file on HDFS.  No problem. Then I tried = to create an external Impala table using the following DDL:
 
CREATE EXTERNAL TABLE AvroTable
STORED AS AVRO
        LOCATION = '/tmp/AvroTable';
 
I got the error =E2=80=9CERROR: = AnalysisException: Error loading Avro schema: No Avro schema provided in = SERDEPROPERTIES or TBLPROPERTIES for table: default.AvroTable=E2=80=9D
 
So I extracted the schema from the Avro file using the = avro-tools-1.7.4.jar (-getschema) into a JSON file, then per the = recommendation above, changed the DDL to point to it:
 
CREATE EXTERNAL TABLE AvroTable
STORED AS AVRO
        LOCATION = '/tmp/AvroTable'
TBLPROPERTIES(
        = 'serialization.format'=3D'1',
        = 'avro.schema.url'=3D'hdfs://xxxxxxxx.xxxxxxxx.xxxxxxxx.net/tmp/AvroTable.schema'=
);
 
This worked fine.  But my question is, why do you have = to do this?  The schema is already in the Avro file =E2=80=93 = that=E2=80=99s where I got the JSON schema file that I point to in the = TBLPROPERTIES parameter!
 
Thanks, Tom
 
Tom Vitale
CREDIT SUISSE
Information Technology | Infra Arch & = Strategy NY, KIVP
Eleven Madison Avenue | 10010-3629 New = York | United States
Phone +1 212 538 = 0708
 



=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D
Please access the attached hyperlink = for an important electronic communications disclaimer:
http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html<= /a>
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D

= --Apple-Mail=_C344996E-2130-4583-BD56-522FF26BF707--