From: "Vitale, Tom" <thomas.vitale@credit-suisse.com>
To: user@hadoop.apache.org
Subject: Impala CREATE TABLE AS AVRO Requires "Redundant" Schema - Why?
Date: Thu, 26 Feb 2015 16:15:04 +0000

I used Sqoop to import an MS SQL Server table into an Avro file on HDFS.  No problem. Then I tried to create an external Impala table using the following DDL:
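For context, a Sqoop 1 import along these lines produces such Avro data files; the host, database, user, and table names below are placeholders, not the exact command used:

```shell
# Illustrative Sqoop import of a SQL Server table as Avro data files.
# Connection string, credentials, and table name are hypothetical.
sqoop import \
  --connect 'jdbc:sqlserver://dbhost:1433;databaseName=mydb' \
  --username myuser -P \
  --table MyTable \
  --as-avrodatafile \
  --target-dir /tmp/AvroTable
```

The `--as-avrodatafile` flag is what makes Sqoop write Avro object container files, each with the schema embedded in its header.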

 

CREATE EXTERNAL TABLE AvroTable
STORED AS AVRO
LOCATION '/tmp/AvroTable';

 

I got the error “ERROR: AnalysisException: Error loading Avro schema: No Avro schema provided in SERDEPROPERTIES or TBLPROPERTIES for table: default.AvroTable”

 

So I extracted the schema from the Avro file into a JSON file using avro-tools-1.7.4.jar (the getschema command), then, as the error message recommends, changed the DDL to point to it:
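The extraction step looks roughly like this; the part-file name is illustrative, and since avro-tools reads local files, the data file is pulled down from HDFS first:

```shell
# Copy one Avro data file out of the table directory (file name illustrative)
hdfs dfs -get /tmp/AvroTable/part-m-00000.avro .
# getschema prints the JSON schema embedded in the file's header
java -jar avro-tools-1.7.4.jar getschema part-m-00000.avro > AvroTable.schema
# Put the schema file where avro.schema.url can reach it
hdfs dfs -put AvroTable.schema /tmp/AvroTable.schema
```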

 

CREATE EXTERNAL TABLE AvroTable
STORED AS AVRO
LOCATION '/tmp/AvroTable'
TBLPROPERTIES (
    'serialization.format'='1',
    'avro.schema.url'='hdfs://xxxxxxxx.xxxxxxxx.xxxxxxxx.net/tmp/AvroTable.schema'
);

 

This worked fine.  But my question is, why do you have to do this?  The schema is already in the Avro file – that’s where I got the JSON schema file that I point to in the TBLPROPERTIES parameter!
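To illustrate the point that the schema travels inside the data: an Avro object container file starts with the magic bytes "Obj" 0x01, followed by a metadata map whose avro.schema entry holds the schema as plain JSON text. A toy header (real files also carry a writer-computed 16-byte sync marker, and real schemas are longer) shows the schema can be pulled straight out of the bytes:

```shell
# Build a toy Avro-style header: magic "Obj\1", then a one-entry metadata
# map with key "avro.schema" and a JSON value. Lengths are zigzag varints:
# \002 = 1 entry, \026 = 11-byte key, \070 = 28-byte value; \000 ends the map.
printf 'Obj\001\002\026avro.schema\070{"type":"record","name":"T"}\000' > demo.avro
# The schema is recoverable directly from the raw file bytes:
grep -ao '{.*}' demo.avro
```

This should print {"type":"record","name":"T"}, which is essentially what avro-tools getschema does in a more principled way.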

 

Thanks, Tom

 

Tom Vitale

CREDIT SUISSE

Information Technology | Infra Arch & Strategy NY, KIVP

Eleven Madison Avenue | 10010-3629 New York | United States

Phone +1 212 538 0708

thomas.vitale@credit-suisse.com | www.credit-suisse.com

 




==============================================================================
Please access the attached hyperlink for an important electronic communications disclaimer:
http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html
==============================================================================
