Date: Wed, 18 Jan 2017 13:31:26 +0000 (UTC)
From: "Liz Szilagyi (JIRA)"
To: dev@sqoop.apache.org
Reply-To: dev@sqoop.apache.org
Mailing-List: contact dev-help@sqoop.apache.org; run by ezmlm
Subject: [jira] [Assigned] (SQOOP-3088) Sqoop export with Parquet data failure does not contain the MapTask error
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
archived-at: Wed, 18 Jan 2017 13:31:44 -0000

     [ https://issues.apache.org/jira/browse/SQOOP-3088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liz Szilagyi reassigned SQOOP-3088:
-----------------------------------

    Assignee: Liz Szilagyi

> Sqoop export with Parquet data failure does not contain the MapTask error
> --------------------------------------------------------------------------
>
>                 Key: SQOOP-3088
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3088
>             Project: Sqoop
>          Issue Type: Bug
>          Components: tools
>            Reporter: Markus Kemper
>            Assignee: Liz Szilagyi
>
> *Test Case*
> {noformat}
> #################
> # STEP 01 - Setup Table and Data
> #################
>
> export MYCONN=jdbc:oracle:thin:@oracle.cloudera.com:1521/orcl12c;
> export MYUSER=sqoop
> export MYPSWD=cloudera
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "drop table t1_oracle"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "create table t1_oracle (c1 int, c2 varchar(10))"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "insert into t1_oracle values (1, 'data')"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "select * from t1_oracle"
>
> Output:
> -------------------------------------
> | C1                   | C2         |
> -------------------------------------
> | 1                    | data       |
> -------------------------------------
>
> #################
> # STEP 02 - Import Data as Parquet
> #################
>
> sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table T1_ORACLE --target-dir /user/user1/t1_oracle_parquet --delete-target-dir --num-mappers 1 --as-parquetfile
>
> Output:
> 16/12/21 07:11:47 INFO mapreduce.ImportJobBase: Transferred 1.624 KB in 50.1693 seconds (33.1478 bytes/sec)
> 16/12/21 07:11:47 INFO mapreduce.ImportJobBase: Retrieved 1 records.
>
> #################
> # STEP 03 - Verify Parquet Data
> #################
>
> hdfs dfs -ls /user/user1/t1_oracle_parquet/*.parquet
> parquet-tools schema -d hdfs://namenode.cloudera.com/user/user1/t1_oracle_parquet/a6ba3dda-b5fc-42d7-9555-5837a12a036b.parquet
>
> Output:
> -rw-r--r--   3 user1 user1        597 2016-12-21 07:11 /user/user1/t1_oracle_parquet/a6ba3dda-b5fc-42d7-9555-5837a12a036b.parquet
> ---
> message T1_ORACLE {
>   optional binary C1 (UTF8);
>   optional binary C2 (UTF8);
> }
> creator: parquet-mr version 1.5.0-cdh5.8.3 (build ${buildNumber})
> extra: parquet.avro.schema = {"type":"record","name":"T1_ORACLE","doc":"Sqoop import of T1_ORACLE","fields":[{"name":"C1","type":["null","string"],"default":null,"columnName":"C1","sqlType":"2"},{"name":"C2","type":["null","string"],"default":null,"columnName":"C2","sqlType":"12"}],"tableName":"T1_ORACLE"}
>
> file schema: T1_ORACLE
> ------------------------------------------------------------------------------------------------------------------------
> C1: OPTIONAL BINARY O:UTF8 R:0 D:1
> C2: OPTIONAL BINARY O:UTF8 R:0 D:1
>
> row group 1: RC:1 TS:85
> ------------------------------------------------------------------------------------------------------------------------
> C1: BINARY SNAPPY DO:0 FPO:4 SZ:40/38/0.95 VC:1 ENC:PLAIN,RLE,BIT_PACKED
> C2: BINARY SNAPPY DO:0 FPO:44 SZ:49/47/0.96 VC:1 ENC:PLAIN,RLE,BIT_PACKED
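>
> # Aside, a hedged extra check (not part of the original test case): the
> # embedded Avro schema above names the record "T1_ORACLE", the same name as
> # the ORM class Sqoop generates, which matters for the export step below.
> # The same key/value metadata can also be dumped with:
> parquet-tools meta hdfs://namenode.cloudera.com/user/user1/t1_oracle_parquet/a6ba3dda-b5fc-42d7-9555-5837a12a036b.parquet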
>
> #################
> # STEP 04 - Export Parquet Data
> #################
>
> sqoop export --connect $MYCONN --username $MYUSER --password $MYPSWD --table T1_ORACLE --export-dir /user/user1/t1_oracle_parquet --num-mappers 1 --verbose
>
> Output:
> [sqoop debug]
> 16/12/21 07:15:06 INFO mapreduce.Job:  map 0% reduce 0%
> 16/12/21 07:15:40 INFO mapreduce.Job:  map 100% reduce 0%
> 16/12/21 07:15:40 INFO mapreduce.Job: Job job_1481911879790_0026 failed with state FAILED due to: Task failed task_1481911879790_0026_m_000000
> Job failed as tasks failed. failedMaps:1 failedReduces:0
> 16/12/21 07:15:40 INFO mapreduce.Job: Counters: 8
>   Job Counters
>     Failed map tasks=1
>     Launched map tasks=1
>     Data-local map tasks=1
>     Total time spent by all maps in occupied slots (ms)=32125
>     Total time spent by all reduces in occupied slots (ms)=0
>     Total time spent by all map tasks (ms)=32125
>     Total vcore-seconds taken by all map tasks=32125
>     Total megabyte-seconds taken by all map tasks=32896000
> 16/12/21 07:15:40 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
> 16/12/21 07:15:40 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 46.8304 seconds (0 bytes/sec)
> 16/12/21 07:15:40 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
> 16/12/21 07:15:40 INFO mapreduce.ExportJobBase: Exported 0 records.
> 16/12/21 07:15:40 DEBUG util.ClassLoaderStack: Restoring classloader: java.net.FactoryURLClassLoader@577cfae6
> 16/12/21 07:15:40 ERROR tool.ExportTool: Error during export: Export job failed!
>
> [yarn debug]
> 2016-12-21 07:15:38,911 DEBUG [Thread-11] org.apache.sqoop.mapreduce.AsyncSqlOutputFormat: Committing transaction of 0 statements
> 2016-12-21 07:15:38,914 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : parquet.io.ParquetDecodingException: Can not read value at 1 in block 0 in file hdfs://nameservice1/user/user1/t1_oracle_parquet/a6ba3dda-b5fc-42d7-9555-5837a12a036b.parquet
>   at parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:241)
>   at parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:227)
>   at org.kitesdk.data.spi.filesystem.AbstractCombineFileRecordReader.nextKeyValue(AbstractCombineFileRecordReader.java:68)
>   at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.nextKeyValue(CombineFileRecordReader.java:69)
>   at org.kitesdk.data.spi.AbstractKeyRecordReaderWrapper.nextKeyValue(AbstractKeyRecordReaderWrapper.java:55)
>   at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
>   at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
>   at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.ClassCastException: T1_ORACLE cannot be cast to org.apache.avro.generic.IndexedRecord
>   at parquet.avro.AvroIndexedRecordConverter.start(AvroIndexedRecordConverter.java:185)
>   at parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:391)
>   at parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:216)
> {noformat}
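
The "Caused by" in the YARN task log is the interesting part: parquet-avro's
AvroIndexedRecordConverter casts each materialized record to
org.apache.avro.generic.IndexedRecord, and the class resolved for the schema
name T1_ORACLE is presumably Sqoop's generated ORM class, which implements
SqoopRecord rather than any Avro record interface. A minimal sketch of the
failing cast, with a hypothetical stand-in for the generated class:

{code:java}
import org.apache.avro.generic.IndexedRecord;

public class CastRepro {

    // Hypothetical stand-in for Sqoop's generated ORM class; the real one
    // implements org.apache.sqoop.lib.SqoopRecord, which is unrelated to
    // Avro's IndexedRecord.
    static class T1_ORACLE { /* generated fields and JDBC plumbing elided */ }

    public static void main(String[] args) {
        Object materialized = new T1_ORACLE();
        // Equivalent of what AvroIndexedRecordConverter.start() does per row:
        IndexedRecord record = (IndexedRecord) materialized;
        // -> java.lang.ClassCastException: T1_ORACLE cannot be cast to
        //    org.apache.avro.generic.IndexedRecord
        System.out.println(record);
    }
}
{code}

Note that the ClassCastException surfaces only in the YARN task log; the sqoop
console output in STEP 04 just says "Export job failed!", and that missing
propagation of the MapTask error is exactly what this issue reports.


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)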