From: "lovekesh bansal (JIRA)"
To: dev@hive.apache.org
Date: Wed, 27 May 2015 05:04:17 +0000 (UTC)
Subject: [jira] [Created] (HIVE-10830) First column of a Hive table created with LazyBinaryColumnarSerDe is not read properly

lovekesh bansal created HIVE-10830:
--------------------------------------

             Summary: First column of a Hive table created with LazyBinaryColumnarSerDe is not read properly
                 Key: HIVE-10830
                 URL: https://issues.apache.org/jira/browse/HIVE-10830
             Project: Hive
          Issue Type: Bug
            Reporter: lovekesh bansal

1.
    create external table platdev.table_target (
        id INT, message String, state string, date string )
    partitioned by (country string)
    row format delimited fields terminated by ','
    stored as RCFILE
    location '/user/nikgupta/table_target' ;

2.
    insert overwrite table platdev.table_target partition(country)
    select case when id=13 then 15 else id end, message, state, date, country
    from platdev.table_base2
    where id between 13 and 16;

Say my table now has the following data:

    15    thirteen    delhi      2-12-2014    india
    14    fourteen    delhi      1-1-2014     india
    15    fifteen     florida    1-1-2014     us
    16    sixteen     florida    2-12-2014    us

Now, if I try to read the data with a MapReduce program whose map function is as given below:

    public void map(LongWritable key, BytesRefArrayWritable val, Context context)
            throws IOException, InterruptedException {

        for (int i = 0; i < val.size(); i++) {
            BytesRefWritable bytesRefread = val.get(i);
            byte[] currentCell = Arrays.copyOfRange(bytesRefread.getData(),
                    bytesRefread.getStart(),
                    bytesRefread.getStart() + bytesRefread.getLength());
            Text currentCellStr = new Text(currentCell);
            System.out.println("rowText=" + currentCellStr);
        }
        context.write(NullWritable.get(), bytes);
    }

and set the following job configuration parameters:

    job.setInputFormatClass(RCFileMapReduceInputFormat.class);
    job.setOutputFormatClass(RCFileMapReduceOutputFormat.class);
    jobConf.setInt(RCFile.COLUMN_NUMBER_CONF_STR, 5);

The output shown is as follows (<0x0F> denotes the single non-printable byte 0x0F):

    rowText=<0x0F>
    rowText=fifteen
    rowText=goa
    rowText=2-2-2222
    rowText=us

But exactly the same case, using the ColumnarSerDe explicitly in the table definition, gives the following output:

    rowText=<0x0F>1
    rowText=fifteen
    rowText=goa
    rowText=2-2-2222
    rowText=us

The point is that the first column value is missing.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
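
The map function in the report wraps every cell's raw bytes in a Text object, so each cell is printed as if it held UTF-8 characters. As a minimal sketch of why that matters for the first column, the snippet below takes the single byte 0x0F seen in the reported output and reads it two ways: as text and as a one-byte integer. The class and helper names are hypothetical, and the idea that a binary SerDe might store a small INT such as 15 in one byte is an assumption used for illustration, not something the report confirms.

```java
import java.nio.charset.StandardCharsets;

final class CellDecodeSketch {

    // Reads the cell bytes as UTF-8 text, which is what wrapping them
    // in a Text object and printing it amounts to.
    static String asText(byte[] cell) {
        return new String(cell, StandardCharsets.UTF_8);
    }

    // Hypothetical helper: treats a one-byte cell as a small signed integer.
    // Assumption (not from the report): small ints are stored in one byte.
    static int asSingleByteInt(byte[] cell) {
        if (cell.length != 1) {
            throw new IllegalArgumentException("expected exactly one byte");
        }
        return cell[0];
    }

    public static void main(String[] args) {
        byte[] binaryCell = {0x0F};                               // byte from the reported output
        byte[] textCell = "15".getBytes(StandardCharsets.UTF_8);  // characters '1' and '5'

        System.out.println(asSingleByteInt(binaryCell));          // prints 15
        System.out.println(asText(textCell));                     // prints 15
        System.out.println(asText(binaryCell).equals("15"));      // prints false
    }
}
```

If that assumption holds for this table, the first column may be present but stored in a binary form that printing through Text cannot render, rather than missing.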